Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
| SESSION: Keynote I | ||
| Ricardo Baeza-Yates | ||
| Understanding Human Language: Can NLP and Deep Learning Help? | ||
| Christopher Manning | ||
| Pages: 1-1 | ||
| doi>10.1145/2911451.2926732 | ||
There is a lot of overlap between the core problems of information retrieval (IR) and natural language processing (NLP). An IR system gains from understanding a user need and from understanding documents, and hence being able to determine whether a document ...
|
||
| SESSION: Keynote II | ||
| Susan Dumais | ||
| Big Data in Climate: Opportunities and Challenges for Machine Learning | ||
| Vipin Kumar | ||
| Pages: 3-3 | ||
| doi>10.1145/2911451.2911550 | ||
This talk will present an overview of research being done in a large interdisciplinary project on the development of novel data mining and machine learning approaches for analyzing massive amount of climate and ecosystem data now available from satellite ...
|
||
| SESSION: Evaluation I | ||
| Ben Carterette | ||
| Statistical Significance, Power, and Sample Sizes: A Systematic Review of SIGIR and TOIS, 2006-2015 | ||
| Tetsuya Sakai | ||
| Pages: 5-14 | ||
| doi>10.1145/2911451.2911492 | ||
We conducted a systematic review of 840 SIGIR full papers and 215 TOIS papers published between 2006 and 2015. The original objective of the study was to identify IR effectiveness experiments that are seriously underpowered (i.e., the sample size is ...
|
||
| Bayesian Performance Comparison of Text Classifiers | ||
| Dell Zhang, Jun Wang, Emine Yilmaz, Xiaoling Wang, Yuxin Zhou | ||
| Pages: 15-24 | ||
| doi>10.1145/2911451.2911547 | ||
How can we know whether one classifier is really better than the other? In the area of text classification, since the publication of Yang and Liu's seminal SIGIR-1999 paper, it has become a standard practice for researchers to apply null-hypothesis significance ...
|
||
| A General Linear Mixed Models Approach to Study System Component Effects | ||
| Nicola Ferro, Gianmaria Silvello | ||
| Pages: 25-34 | ||
| doi>10.1145/2911451.2911530 | ||
Topic variance has a greater effect on performances than system variance but it cannot be controlled by system developers who can only try to cope with it. On the other hand, system variance is important on its own, since it is what system developers ...
|
||
| SESSION: Speech and Conversation Systems | ||
| Gareth Jones | ||
| Searching by Talking: Analysis of Voice Queries on Mobile Web Search | ||
| Ido Guy | ||
| Pages: 35-44 | ||
| doi>10.1145/2911451.2911525 | ||
The growing popularity of mobile search and the advancement in voice recognition technologies have opened the door for web search users to speak their queries, rather than type them. While this kind of voice search is still in its infancy, it is gradually ...
|
||
| Predicting User Satisfaction with Intelligent Assistants | ||
| Julia Kiseleva, Kyle Williams, Ahmed Hassan Awadallah, Aidan C. Crook, Imed Zitouni, Tasos Anastasakos | ||
| Pages: 45-54 | ||
| doi>10.1145/2911451.2911521 | ||
There is a rapid growth in the use of voice-controlled intelligent personal assistants on mobile devices, such as Microsoft's Cortana, Google Now, and Apple's Siri. They significantly change the way users interact with search systems, not only because ...
|
||
| Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System | ||
| Rui Yan, Yiping Song, Hua Wu | ||
| Pages: 55-64 | ||
| doi>10.1145/2911451.2911542 | ||
To establish an automatic conversation system between humans and computers is regarded as one of the most hardcore problems in computer science, which involves interdisciplinary techniques in information retrieval, natural language processing, artificial ...
|
||
| SESSION: Retrieval Models | ||
| Maarten de Rijke | ||
| Document Retrieval Using Entity-Based Language Models | ||
| Hadas Raviv, Oren Kurland, David Carmel | ||
| Pages: 65-74 | ||
| doi>10.1145/2911451.2911508 | ||
We address the ad hoc document retrieval task by devising novel types of entity-based language models. The models utilize information about single terms in the query and documents as well as term sequences marked as entities by some entity-linking tool. ...
|
||
| Engineering Quality and Reliability in Technology-Assisted Review | ||
| Gordon V. Cormack, Maura R. Grossman | ||
| Pages: 75-84 | ||
| doi>10.1145/2911451.2911510 | ||
The objective of technology-assisted review ("TAR") is to find as much relevant information as possible with reasonable effort. Quality is a measure of the extent to which a TAR method achieves this objective, while reliability is a measure of how consistently ...
|
||
| A Sequential Decision Formulation of the Interface Card Model for Interactive IR | ||
| Yinan Zhang, Chengxiang Zhai | ||
| Pages: 85-94 | ||
| doi>10.1145/2911451.2911543 | ||
The Interface Card model is a promising new theoretical framework for modeling and optimizing interactive retrieval interfaces, but how to systematically instantiate it to solve concrete interface optimization problems remains an open challenge. We propose ...
|
||
| SESSION: Learning-to-rank | ||
| Matthew Lease | ||
| Generalized BROOF-L2R: A General Framework for Learning to Rank Based on Boosting and Random Forests | ||
| Clebson C.A. de Sá, Marcos A. Gonçalves, Daniel X. Sousa, Thiago Salles | ||
| Pages: 95-104 | ||
| doi>10.1145/2911451.2911540 | ||
The task of retrieving information that really matters to the users is considered hard when taking into consideration the current and increasing amount of available information. To improve the effectiveness of this information seeking task, systems ...
|
||
| An Optimization Framework for Remapping and Reweighting Noisy Relevance Labels | ||
| Yury Ustinovskiy, Valentina Fedorova, Gleb Gusev, Pavel Serdyukov | ||
| Pages: 105-114 | ||
| doi>10.1145/2911451.2911501 | ||
Relevance labels are the essential part of any learning to rank framework. The rapid development of crowdsourcing platforms led to a significant reduction of the cost of manual labeling. This makes it possible to collect very large sets of labeled documents ...
|
||
| Learning to Rank with Selection Bias in Personal Search | ||
| Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork | ||
| Pages: 115-124 | ||
| doi>10.1145/2911451.2911537 | ||
Click-through data has proven to be a critical resource for improving search ranking quality. Though a large amount of click data can be easily collected by search engines, various biases make it difficult to fully leverage this type of data. In the ...
|
||
| SESSION: Music and Math | ||
| Jaap Kamps | ||
| On Effective Personalized Music Retrieval by Exploring Online User Behaviors | ||
| Zhiyong Cheng, Shen Jialie, Steven C.H. Hoi | ||
| Pages: 125-134 | ||
| doi>10.1145/2911451.2911491 | ||
In this paper, we study the problem of personalized text based music retrieval which takes users' music preferences on songs into account via the analysis of online listening behaviours and social tags. Towards the goal, a novel Dual-Layer Music Preference ...
|
||
| Semantification of Identifiers in Mathematics for Better Math Information Retrieval | ||
| Moritz Schubotz, Alexey Grigorev, Marcus Leich, Howard S. Cohl, Norman Meuschke, Bela Gipp, Abdou S. Youssef, Volker Markl | ||
| Pages: 135-144 | ||
| doi>10.1145/2911451.2911503 | ||
Mathematical formulae are essential in science, but face challenges of ambiguity, due to the use of a small number of identifiers to represent an immense number of concepts. Corresponding to word sense disambiguation in Natural Language Processing, we ...
|
||
| Multi-Stage Math Formula Search: Using Appearance-Based Similarity Metrics at Scale | ||
| Richard Zanibbi, Kenny Davila, Andrew Kane, Frank Wm. Tompa | ||
| Pages: 145-154 | ||
| doi>10.1145/2911451.2911512 | ||
When using a mathematical formula for search (query-by-expression), the suitability of retrieved formulae often depends more upon symbol identities and layout than deep mathematical semantics. Using a Symbol Layout Tree representation for formula appearance, ...
|
||
| SESSION: Microblog | ||
| Mark D. Smucker | ||
| Explainable User Clustering in Short Text Streams | ||
| Yukun Zhao, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, Maarten de Rijke | ||
| Pages: 155-164 | ||
| doi>10.1145/2911451.2911522 | ||
User clustering has been studied from different angles: behavior-based, to identify similar browsing or search patterns, and content-based, to identify shared interests. Once user clusters have been found, they can be used for recommendation and personalization. ...
|
||
| Topic Modeling for Short Texts with Auxiliary Word Embeddings | ||
| Chenliang Li, Haoran Wang, Zhiqian Zhang, Aixin Sun, Zongyang Ma | ||
| Pages: 165-174 | ||
| doi>10.1145/2911451.2911499 | ||
For many applications that require semantic understanding of short texts, inferring discriminative and coherent latent topics from short texts is a critical and fundamental task. Conventional topic models largely rely on word co-occurrences to derive ...
|
||
| Interleaved Evaluation for Retrospective Summarization and Prospective Notification on Document Streams | ||
| Xin Qian, Jimmy Lin, Adam Roegiest | ||
| Pages: 175-184 | ||
| doi>10.1145/2911451.2911494 | ||
We propose and validate a novel interleaved evaluation methodology for two complementary information seeking tasks on document streams: retrospective summarization and prospective notification. In the first, the user desires relevant and non-redundant ...
|
||
| SESSION: Web Search | ||
| David Hawking | ||
| Learning Query and Document Relevance from a Web-scale Click Graph | ||
| Shan Jiang, Yuening Hu, Changsung Kang, Tim Daly, Jr., Dawei Yin, Yi Chang, Chengxiang Zhai | ||
| Pages: 185-194 | ||
| doi>10.1145/2911451.2911531 | ||
Click-through logs over query-document pairs provide rich and valuable information for multiple tasks in information retrieval. This paper proposes a vector propagation algorithm on the click graph to learn vector representations for both queries and ...
|
||
| Click-based Hot Fixes for Underperforming Torso Queries | ||
| Masrour Zoghi, Tomáš Tunys, Lihong Li, Damien Jose, Junyan Chen, Chun Ming Chin, Maarten de Rijke | ||
| Pages: 195-204 | ||
| doi>10.1145/2911451.2911500 | ||
Ranking documents using their historical click-through rate (CTR) can improve relevance for frequently occurring queries, i.e., so-called head queries. It is difficult to use such click signals on non-head queries as they receive fewer clicks. In this ...
|
||
| A Context-aware Time Model for Web Search | ||
| Alexey Borisov, Ilya Markov, Maarten de Rijke, Pavel Serdyukov | ||
| Pages: 205-214 | ||
| doi>10.1145/2911451.2911504 | ||
In web search, information about times between user actions has been shown to be a good indicator of users' satisfaction with the search results. Existing work uses the mean values of the observed times, or fits probability distributions to the observed ...
|
||
| SESSION: Question Answering | ||
| Hideo Joho | ||
| Novelty based Ranking of Human Answers for Community Questions | ||
| Adi Omari, David Carmel, Oleg Rokhlenko, Idan Szpektor | ||
| Pages: 215-224 | ||
| doi>10.1145/2911451.2911506 | ||
Questions and their corresponding answers within a community based question answering (CQA) site are frequently presented as top search results for Web search queries and viewed by millions of searchers daily. The number of answers for CQA questions ranges ...
|
||
| That's Not My Question: Learning to Weight Unmatched Terms in CQA Vertical Search | ||
| Boaz Petersil, Avihai Mejer, Idan Szpektor, Koby Crammer | ||
| Pages: 225-234 | ||
| doi>10.1145/2911451.2911496 | ||
A fundamental task in Information Retrieval (IR) is term weighting. Early IR theory considered both the presence or absence of all terms in the lexicon for ranking and needed to weight them all. Yet, as the size of lexicons grew and models became too ...
|
||
| When a Knowledge Base Is Not Enough: Question Answering over Knowledge Bases with External Text Data | ||
| Denis Savenkov, Eugene Agichtein | ||
| Pages: 235-244 | ||
| doi>10.1145/2911451.2911536 | ||
One of the major challenges for automated question answering over Knowledge Bases (KBQA) is translating a natural language question to the Knowledge Base (KB) entities and predicates. Previous systems have used a limited amount of training data to learn ...
|
||
| SESSION: Learning | ||
| Emine Yilmaz | ||
| Transfer Learning for Cross-Lingual Sentiment Classification with Weakly Shared Deep Neural Networks | ||
| Guangyou Zhou, Zhao Zeng, Jimmy Xiangji Huang, Tingting He | ||
| Pages: 245-254 | ||
| doi>10.1145/2911451.2911490 | ||
Cross-lingual sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of data in a label-scarce target language by exploiting labeled data from a label-rich language. The fundamental challenge of cross-lingual ...
|
||
| Query to Knowledge: Unsupervised Entity Extraction from Shopping Queries using Adaptor Grammars | ||
| Ke Zhai, Zornitsa Kozareva, Yuening Hu, Qi Li, Weiwei Guo | ||
| Pages: 255-264 | ||
| doi>10.1145/2911451.2911495 | ||
Web search queries provide a surprisingly large amount of information, which can be potentially organized and converted into a knowledgebase. In this paper, we focus on the problem of automatically identifying brand and product entities from a large ...
|
||
| Learning for Efficient Supervised Query Expansion via Two-stage Feature Selection | ||
| Zhiwei Zhang, Qifan Wang, Luo Si, Jianfeng Gao | ||
| Pages: 265-274 | ||
| doi>10.1145/2911451.2911539 | ||
Query expansion (QE) is a well known technique to improve retrieval effectiveness, which expands original queries with extra terms that are predicted to be relevant. A recent trend in the literature is Supervised Query Expansion (SQE), where supervised ...
|
||
| SESSION: Efficiency I | ||
| Alistair Moffat | ||
| Leveraging Context-Free Grammar for Efficient Inverted Index Compression | ||
| Zhaohua Zhang, Jiancong Tong, Haibing Huang, Jin Liang, Tianlong Li, Rebecca J. Stones, Gang Wang, Xiaoguang Liu | ||
| Pages: 275-284 | ||
| doi>10.1145/2911451.2911518 | ||
Large-scale search engines need to answer thousands of queries per second over billions of documents, which is typically done by querying a large inverted index. Many highly optimized integer encoding techniques are applied to compress the inverted index ...
|
||
| Fast and Compact Hamming Distance Index | ||
| Simon Gog, Rossano Venturini | ||
| Pages: 285-294 | ||
| doi>10.1145/2911451.2911523 | ||
Searching for similar objects in a collection is a core task of many applications in databases, pattern recognition, and information retrieval. As there exist similarity-preserving hash functions like SimHash, indexing these objects reduces to the solution ...
|
||
| Fast First-Phase Candidate Generation for Cascading Rankers | ||
| Qi Wang, Constantinos Dimopoulos, Torsten Suel | ||
| Pages: 295-304 | ||
| doi>10.1145/2911451.2911515 | ||
Current search engines use very complex ranking functions based on hundreds of features. While such functions return high-quality results, they create efficiency challenges as it is too costly to fully evaluate them on all documents in the union, or ...
|
||
| SESSION: Recommendation Systems I | ||
| Oren Kurland | ||
| Learning to Rank Features for Recommendation over Multiple Categories | ||
| Xu Chen, Zheng Qin, Yongfeng Zhang, Tao Xu | ||
| Pages: 305-314 | ||
| doi>10.1145/2911451.2911549 | ||
Incorporating phrase-level sentiment analysis on users' textual reviews for recommendation has become a popular method due to its explainable property for latent features and high prediction accuracy. However, the inherent limitations of the existing ...
|
||
| How Much Novelty is Relevant?: It Depends on Your Curiosity | ||
| Pengfei Zhao, Dik Lun Lee | ||
| Pages: 315-324 | ||
| doi>10.1145/2911451.2911488 | ||
Traditional recommendation systems (RS's) aim to recommend items that are relevant to the user's interest. Unfortunately, the recommended items will soon become too familiar to the user and hence fail to arouse her interest. Discovery-oriented recommendation ...
|
||
| Discrete Collaborative Filtering | ||
| Hanwang Zhang, Fumin Shen, Wei Liu, Xiangnan He, Huanbo Luan, Tat-Seng Chua | ||
| Pages: 325-334 | ||
| doi>10.1145/2911451.2911502 | ||
We address the efficiency problem of Collaborative Filtering (CF) by hashing users and items as latent vectors in the form of binary codes, so that user-item affinity can be efficiently calculated in a Hamming space. However, existing hashing methods ...
|
||
| SESSION: User Needs | ||
| Diane Kelly | ||
| Understanding Information Need: An fMRI Study | ||
| Yashar Moshfeghi, Peter Triantafillou, Frank E. Pollick | ||
| Pages: 335-344 | ||
| doi>10.1145/2911451.2911534 | ||
The raison d'etre of IR is to satisfy human information need. But, do we really understand information need? Despite advances in the past few decades in both the IR and relevant scientific communities, this question is largely unanswered. We do not really ...
|
||
| User Behavior in Asynchronous Slow Search | ||
| Ryan Burton, Kevyn Collins-Thompson | ||
| Pages: 345-354 | ||
| doi>10.1145/2911451.2911541 | ||
Conventional Web search is predicated on returning results to users as quickly as possible. However, for some search tasks, users have reported a willingness to wait for the perfect set of results. In this work, we present the first study to analyze ...
|
||
| Going back in Time: An Investigation of Social Media Re-finding | ||
| Florian Meier, David Elsweiler | ||
| Pages: 355-364 | ||
| doi>10.1145/2911451.2911524 | ||
Social Media (SM) has become a valuable information source to many in diverse situations. In IR, research has focused on real-time aspects and as such little is known about how long SM content is of value to users, if and how often it is re-accessed, ...
|
||
| SESSION: Privacy, Advertising, and Products | ||
| Grace Hui Yang | ||
| R-Susceptibility: An IR-Centric Approach to Assessing Privacy Risks for Users in Online Communities | ||
| Joanna Asia Biega, Krishna P. Gummadi, Ida Mele, Dragan Milchevski, Christos Tryfonopoulos, Gerhard Weikum | ||
| Pages: 365-374 | ||
| doi>10.1145/2911451.2911533 | ||
Privacy of Internet users is at stake because they expose personal information in posts created in online communities, in search queries, and other activities. An adversary that monitors a community may identify the users with the most sensitive properties ...
|
||
| Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising | ||
| Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Ricardo Baeza-Yates, Andrew Feng, Erik Ordentlich, Lee Yang, Gavin Owens | ||
| Pages: 375-384 | ||
| doi>10.1145/2911451.2911538 | ||
Sponsored search represents a major source of revenue for web search engines. The advertising model brings a unique possibility for advertisers to target direct user intent communicated through a search query, usually done by displaying their ads alongside ...
|
||
| Retrieving Non-Redundant Questions to Summarize a Product Review | ||
| Mengwen Liu, Yi Fang, Dae Hoon Park, Xiaohua Hu, Zhengtao Yu | ||
| Pages: 385-394 | ||
| doi>10.1145/2911451.2911544 | ||
Product reviews have become an important resource for customers before they make purchase decisions. However, the abundance of reviews makes it difficult for customers to digest them and make informed choices. In our study, we aim to help customers who ...
|
||
| SESSION: Novelty and Diversity | ||
| Charlie L.A. Clarke | ||
| Modeling Document Novelty with Neural Tensor Network for Search Result Diversification | ||
| Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng | ||
| Pages: 395-404 | ||
| doi>10.1145/2911451.2911498 | ||
Search result diversification has attracted considerable attention as a means to tackle the ambiguous or multi-faceted information needs of users. One of the key problems in search result diversification is novelty, that is, how to measure the novelty ...
|
||
| ScentBar: A Query Suggestion Interface Visualizing the Amount of Missed Relevant Information for Intrinsically Diverse Search | ||
| Kazutoshi Umemoto, Takehiro Yamamoto, Katsumi Tanaka | ||
| Pages: 405-414 | ||
| doi>10.1145/2911451.2911546 | ||
For intrinsically diverse tasks, in which collecting extensive information from different aspects of a topic is required, searchers often have difficulty formulating queries to explore diverse aspects and deciding when to stop searching. With the goal ...
|
||
| Evaluating Search Result Diversity using Intent Hierarchies | ||
| Xiaojie Wang, Zhicheng Dou, Tetsuya Sakai, Ji-Rong Wen | ||
| Pages: 415-424 | ||
| doi>10.1145/2911451.2911497 | ||
Search result diversification aims at returning diversified document lists to cover different user intents for ambiguous or broad queries. Existing diversity measures assume that user intents are independent or exclusive, and do not consider the relationships ...
|
||
| SESSION: Entities and Knowledge Graphs | ||
| Jamie Callan | ||
| Robust and Collective Entity Disambiguation through Semantic Embeddings | ||
| Stefan Zwicklbauer, Christin Seifert, Michael Granitzer | ||
| Pages: 425-434 | ||
| doi>10.1145/2911451.2911535 | ||
Entity disambiguation is the task of mapping ambiguous terms in natural-language text to its entities in a knowledge base. It finds its application in the extraction of structured data in RDF (Resource Description Framework) from textual documents, but ...
|
||
| Parameterized Fielded Term Dependence Models for Ad-hoc Entity Retrieval from Knowledge Graph | ||
| Fedor Nikolaev, Alexander Kotov, Nikita Zhiltsov | ||
| Pages: 435-444 | ||
| doi>10.1145/2911451.2911545 | ||
Accurate projection of terms in free-text queries onto structured entity representations is one of the fundamental problems in entity retrieval from knowledge graphs. In this paper, we demonstrate that existing retrieval models for ad-hoc structured ...
|
||
| Hierarchical Random Walk Inference in Knowledge Graphs | ||
| Qiao Liu, Liuyi Jiang, Minghao Han, Yao Liu, Zhiguang Qin | ||
| Pages: 445-454 | ||
| doi>10.1145/2911451.2911509 | ||
Relational inference is a crucial technique for knowledge base population. The central problem in the study of relational inference is to infer unknown relations between entities from the facts given in the knowledge bases. Two popular models have been ...
|
||
| SESSION: SIRIP I: Big companies, big data | ||
| Gilad Mishne | ||
| When Watson Went to Work: Leveraging Cognitive Computing in the Real World | ||
| Aya Soffer, David Konopnicki, Haggai Roitman | ||
| Pages: 455-456 | ||
| doi>10.1145/2911451.2926724 | ||
||
| Ask Your TV: Real-Time Question Answering with Recurrent Neural Networks | ||
| Ferhan Ture, Oliver Jojic | ||
| Pages: 457-458 | ||
| doi>10.1145/2911451.2926729 | ||
Voice-based interfaces are very popular in today's world, and Comcast customers are no exception. Usage stats show that our new X1 TV platform receives millions of voice queries per day. As a result, expanding the coverage of our voice interface provides ...
|
||
| Amazon Search: The Joy of Ranking Products | ||
| Daria Sorokina, Erick Cantu-Paz | ||
| Pages: 459-460 | ||
| doi>10.1145/2911451.2926725 | ||
Amazon is one of the world's largest e-commerce sites and Amazon Search powers the majority of Amazon's sales. As a consequence, even small improvements in relevance ranking both positively influence the shopping experience of millions of customers and ...
|
||
| Learning to Rank Personalized Search Results in Professional Networks | ||
| Viet Ha-Thuc, Shakti Sinha | ||
| Pages: 461-462 | ||
| doi>10.1145/2911451.2927018 | ||
LinkedIn search is deeply personalized - for the same queries, different searchers expect completely different results. This paper presents our approach to achieving this by mining various data sources available in LinkedIn to infer searchers' intents ...
|
||
| SESSION: Evaluation II | ||
| Tetsuya Sakai | ||
| When does Relevance Mean Usefulness and User Satisfaction in Web Search? | ||
| Jiaxin Mao, Yiqun Liu, Ke Zhou, Jian-Yun Nie, Jingtao Song, Min Zhang, Shaoping Ma, Jiashen Sun, Hengliang Luo | ||
| Pages: 463-472 | ||
| doi>10.1145/2911451.2911507 | ||
Relevance is a fundamental concept in information retrieval (IR) studies. It is however often observed that relevance as annotated by secondary assessors may not necessarily mean usefulness and satisfaction perceived by users. In this study, we confirm ...
|
||
| How Many Workers to Ask?: Adaptive Exploration for Collecting High Quality Labels | ||
| Ittai Abraham, Omar Alonso, Vasilis Kandylas, Rajesh Patel, Steven Shelford, Aleksandrs Slivkins | ||
| Pages: 473-482 | ||
| doi>10.1145/2911451.2911514 | ||
Crowdsourcing has been part of the IR toolbox as a cheap and fast mechanism to obtain labels for system development and evaluation. Successful deployment of crowdsourcing at scale involves adjusting many variables, a very important one being the number ...
|
||
| Risk-Sensitive Evaluation and Learning to Rank using Multiple Baselines | ||
| B. Taner Dinçer, Craig Macdonald, Iadh Ounis | ||
| Pages: 483-492 | ||
| doi>10.1145/2911451.2911511 | ||
A robust retrieval system ensures that user experience is not damaged by the presence of poorly-performing queries. Such robustness can be measured by risk-sensitive evaluation measures, which assess the extent to which a system performs worse than a ...
|
||
| SESSION: Events | ||
| Fernando Diaz | ||
| Event Digest: A Holistic View on Past Events | ||
| Arunav Mishra, Klaus Berberich | ||
| Pages: 493-502 | ||
| doi>10.1145/2911451.2911526 | ||
For a general user, easy access to vast amounts of online information available on past events has made retrospection much harder. We propose a problem of automatic event digest generation to aid effective and efficient retrospection. For this, in addition ...
|
||
| Terms over LOAD: Leveraging Named Entities for Cross-Document Extraction and Summarization of Events | ||
| Andreas Spitz, Michael Gertz | ||
| Pages: 503-512 | ||
| doi>10.1145/2911451.2911529 | ||
Real world events, such as historic incidents, typically contain both spatial and temporal aspects and involve a specific group of persons. This is reflected in the descriptions of events in textual sources, which contain mentions of named entities and ...
|
||
| GeoBurst: Real-Time Local Event Detection in Geo-Tagged Tweet Streams | ||
| Chao Zhang, Guangyu Zhou, Quan Yuan, Honglei Zhuang, Yu Zheng, Lance Kaplan, Shaowen Wang, Jiawei Han | ||
| Pages: 513-522 | ||
| doi>10.1145/2911451.2911519 | ||
The real-time discovery of local events (e.g., protests, crimes, disasters) is of great importance to various applications, such as crime monitoring, disaster alarming, and activity recommendation. While this task was nearly impossible years ago due ...
|
||
| SESSION: SIRIP II: Small companies, big ideas | ||
| Gilad Mishne | ||
| Building a Self-Learning Search Engine: From Research to Business | ||
| Manos Tsagkias, Wouter Weerkamp | ||
| Pages: 523-524 | ||
| doi>10.1145/2911451.2926728 | ||
904Labs B.V. was founded in 2014 by Wouter Weerkamp, Manos Tsagkias, and Maarten de Rijke to commercialize state-of-the-art search engine technology. 904Labs' strategic product is a self-learning search engine for online retailers, which uses some of ...
|
||
| Sedano: A News Stream Processor for Business | ||
| Ugo Scaiella, Giacomo Berardi, Giuliano Mega, Roberto Santoro | ||
| Pages: 525-526 | ||
| doi>10.1145/2911451.2926730 | ||
We present Sedano, a system for processing and indexing a continuous stream of business-related news. Sedano defines pipelines whose stages analyze and enrich news items (e.g., newspaper articles and press releases). News data coming from several content ...
|
||
| Ranking Financial Tweets | ||
| Diego Ceccarelli, Francesco Nidito, Miles Osborne | ||
| Pages: 527-528 | ||
| doi>10.1145/2911451.2926727 | ||
Recently Twitter has complemented traditional newswire as a source of valuable Financial information. Although there is a rich body of published research dealing with the task of ranking tweets, there has been little published research dealing with ranking ...
|
||
| SESSION: Recommendation Systems II | ||
| Josiane Mothe | ||
| Contextual Bandits in a Collaborative Environment | ||
| Qingyun Wu, Huazheng Wang, Quanquan Gu, Hongning Wang | ||
| Pages: 529-538 | ||
| doi>10.1145/2911451.2911528 | ||
Contextual bandit algorithms provide principled online learning solutions to find optimal trade-offs between exploration and exploitation with companion side-information. They have been extensively used in many important practical scenarios, such as ...
|
||
| Collaborative Filtering Bandits | ||
| Shuai Li, Alexandros Karatzoglou, Claudio Gentile | ||
| Pages: 539-548 | ||
| doi>10.1145/2911451.2911548 | ||
Classical collaborative filtering, and content-based filtering methods try to learn a static recommendation model given training data. These approaches are far from ideal in highly dynamic recommendation domains such as news recommendation and computational ...
|
||
| Fast Matrix Factorization for Online Recommendation with Implicit Feedback | ||
| Xiangnan He, Hanwang Zhang, Min-Yen Kan, Tat-Seng Chua | ||
| Pages: 549-558 | ||
| doi>10.1145/2911451.2911489 | ||
This paper contributes improvements on both the effectiveness and efficiency of Matrix Factorization (MF) methods for implicit feedback. We highlight two critical issues of existing works. First, due to the large space of unobserved feedback, most existing ...
|
||
| SESSION: Image and Multimodal Search | ||
| Gabriella Pasi | ||
| Leveraging User Interaction Signals for Web Image Search | ||
| Neil O'Hare, Paloma de Juan, Rossano Schifanella, Yunlong He, Dawei Yin, Yi Chang | ||
| Pages: 559-568 | ||
| doi>10.1145/2911451.2911532 | ||
|
Full text: |
||
|
User interfaces for web image search engine results differ significantly from interfaces for traditional (text) web search results, supporting a richer interaction. In particular, users can see an enlarged image preview by hovering over a result image, ...
|
||
| Self-Paced Cross-Modal Subspace Matching | ||
| Jian Liang, Zhihang Li, Dong Cao, Ran He, Jingdong Wang | ||
| Pages: 569-578 | ||
| doi>10.1145/2911451.2911527 | ||
|
Full text: |
||
|
Cross-modal matching methods match data from different modalities according to their similarities. Most existing methods utilize label information to reduce the semantic gap between different modalities. However, it is usually time-consuming to manually ...
|
||
| Composite Correlation Quantization for Efficient Multimodal Retrieval | ||
| Mingsheng Long, Yue Cao, Jianmin Wang, Philip S. Yu | ||
| Pages: 579-588 | ||
| doi>10.1145/2911451.2911493 | ||
|
Full text: |
||
|
Efficient similarity retrieval from large-scale multimodal databases is pervasive in modern search engines and social networks. To support queries across content modalities, the system should enable cross-modal correlation and computation-efficient indexing. ...
|
||
| SESSION: SIRIP III: Modeling and Evaluation | ||
| Jussi Karlgren | ||
| Principles for the Design of Online A/B Metrics | ||
| Widad Machmouchi, Georg Buscher | ||
| Pages: 589-590 | ||
| doi>10.1145/2911451.2926731 | ||
|
Full text: |
||
|
In this paper, we describe principles for designing metrics in the context of A/B experiments. We share some issues that come up in designing such experiments and provide solutions to avoid such pitfalls.
|
||
| Visual Recommendation Use Case for an Online Marketplace Platform: allegro.pl | ||
| Anna Wróblewska, Łukasz Rączkowski | ||
| Pages: 591-594 | ||
| doi>10.1145/2911451.2926722 | ||
|
Full text: |
||
|
In this paper we describe a small content-based visual recommendation project built as part of the Allegro online marketplace platform. We extracted relevant data only from images, as they are inherently better at capturing visual attributes than textual ...
|
||
| AOL's Named Entity Resolver: Solving Disambiguation via Document Strongly Connected Components and Ad-Hoc Edges Construction | ||
| Roni Wiener, Yonatan Ben-Simhon, Anna Chen | ||
| Pages: 595-596 | ||
| doi>10.1145/2911451.2926721 | ||
|
Full text: |
||
|
Named Entity Disambiguation is the task of disambiguating named entity mentions in unstructured text and linking them to their corresponding entries in a large knowledge base such as Freebase. Practically, each text match in a given document should be ...
|
||
| The Data Stack in Information Retrieval | ||
| Omar Alonso | ||
| Pages: 597-597 | ||
| doi>10.1145/2911451.2926726 | ||
|
Full text: |
||
|
I propose to look at information retrieval applications from the perspective of the data stack infrastructure that is needed in research prototypes and production systems.
|
||
| SESSION: Behavior Models and Applications | ||
| David Elsweiler | ||
| Predicting User Engagement with Direct Displays Using Mouse Cursor Information | ||
| Ioannis Arapakis, Luis A. Leiva | ||
| Pages: 599-608 | ||
| doi>10.1145/2911451.2911505 | ||
|
Full text: |
||
|
Predicting user engagement with direct displays (DD) is of paramount importance to commercial search engines, as well as to search performance evaluation. However, understanding within-content engagement on a web page is not a trivial task mainly because ...
|
||
| Search Result Prefetching Using Cursor Movement | ||
| Fernando Diaz, Qi Guo, Ryen W. White | ||
| Pages: 609-618 | ||
| doi>10.1145/2911451.2911516 | ||
|
Full text: |
||
|
Search result examination is an important part of searching. High page load latency for landing pages (clicked results) can reduce the efficiency of the search process. Proactively prefetching landing pages in advance of clickthrough can save searchers ...
|
||
| Predicting Search User Examination with Visual Saliency | ||
| Yiqun Liu, Zeyang Liu, Ke Zhou, Meng Wang, Huanbo Luan, Chao Wang, Min Zhang, Shaoping Ma | ||
| Pages: 619-628 | ||
| doi>10.1145/2911451.2911517 | ||
|
Full text: |
||
|
Predicting users' examination of search results is one of the key concerns in Web search related studies. With more and more heterogeneous components federated into search engine result pages (SERPs), it becomes difficult for traditional position-based ...
|
||
| SESSION: Efficiency II | ||
| Rossano Venturini | ||
| A Comparison of Cache Blocking Methods for Fast Execution of Ensemble-based Score Computation | ||
| Xin Jin, Tao Yang, Xun Tang | ||
| Pages: 629-638 | ||
| doi>10.1145/2911451.2911520 | ||
|
Full text: |
||
|
Machine-learned classification and ranking techniques often use ensembles to aggregate partial scores of feature vectors for high accuracy, and the runtime score computation can become expensive when employing a large number of ensembles. The previous ...
|
||
| Improved Caching Techniques for Large-Scale Image Hosting Services | ||
| Xiao Bai, B. Barla Cambazoglu, Archie Russell | ||
| Pages: 639-648 | ||
| doi>10.1145/2911451.2911513 | ||
|
Full text: |
||
|
Commercial image serving systems, such as Flickr and Facebook, rely on large image caches to avoid the retrieval of requested images from the costly backend image store, as much as possible. Such systems serve the same image in different resolutions ...
|
||
| SESSION: Short Collection Papers | ||
| A Complete & Comprehensive Movie Review Dataset (CCMR) | ||
| Xuezhi Cao, Weiyue Huang, Yong Yu | ||
| Pages: 661-664 | ||
| doi>10.1145/2911451.2914669 | ||
|
Full text: |
||
|
Online review sites are widely used for various domains including movies and restaurants. These sites now have strong influences towards users during purchasing processes. There exist plenty of research works for review sites on various aspects, including ...
|
||
| A Cross-Platform Collection of Social Network Profiles | ||
| Maria Han Veiga, Carsten Eickhoff | ||
| Pages: 665-668 | ||
| doi>10.1145/2911451.2914666 | ||
|
Full text: |
||
|
The proliferation of Internet-enabled devices and services has led to a shifting balance between digital and analogue aspects of our everyday lives. In the face of this development there is a growing demand for the study of privacy hazards, the potential ...
|
||
| A Test Collection for Matching Patients to Clinical Trials | ||
| Bevan Koopman, Guido Zuccon | ||
| Pages: 669-672 | ||
| doi>10.1145/2911451.2914672 | ||
|
Full text: |
||
|
We present a test collection to study the use of search engines for matching eligible patients (the query) to clinical trials (the document). Clinical trials are experiments conducted in the development of new medical treatments, drugs or devices. Recruiting ...
|
||
| ArabicWeb16: A New Crawl for Today's Arabic Web | ||
| Reem Suwaileh, Mucahid Kutlu, Nihal Fathima, Tamer Elsayed, Matthew Lease | ||
| Pages: 673-676 | ||
| doi>10.1145/2911451.2914677 | ||
|
Full text: |
||
|
Web crawls provide valuable snapshots of the Web which enable a wide variety of research, be it distributional analysis to characterize Web properties or use of language, content analysis in social science, or Information Retrieval (IR) research to develop ...
|
||
| Building Test Collections for Evaluating Temporal IR | ||
| Hideo Joho, Adam Jatowt, Roi Blanco, Haitao Yu, Shuhei Yamamoto | ||
| Pages: 677-680 | ||
| doi>10.1145/2911451.2914673 | ||
|
Full text: |
||
|
Research on temporal aspects of information retrieval has recently gained considerable interest within the Information Retrieval (IR) community. This paper describes our efforts for building test collections for the purpose of fostering temporal IR research. ...
|
||
| DAJEE: A Dataset of Joint Educational Entities for Information Retrieval in Technology Enhanced Learning | ||
| Vladimir Estivill-Castro, Carla Limongelli, Matteo Lombardi, Alessandro Marani | ||
| Pages: 681-684 | ||
| doi>10.1145/2911451.2914670 | ||
|
Full text: |
||
|
In the Technology Enhanced Learning (TEL) community, the problem of conducting reproducible evaluations of recommender systems is still open, due to the lack of exhaustive benchmarks. The few public datasets available in TEL have limitations, being mostly ...
|
||
| Evaluating Retrieval over Sessions: The TREC Session Track 2011-2014 | ||
| Ben Carterette, Paul Clough, Mark Hall, Evangelos Kanoulas, Mark Sanderson | ||
| Pages: 685-688 | ||
| doi>10.1145/2911451.2914675 | ||
|
Full text: |
||
|
Information Retrieval (IR) research has traditionally focused on serving the best results for a single query - so-called ad hoc retrieval. However, users typically search iteratively, refining and reformulating their queries during a session. A key challenge ...
|
||
| EveTAR: A New Test Collection for Event Detection in Arabic Tweets | ||
| Hind Almerekhi, Maram Hasanain, Tamer Elsayed | ||
| Pages: 689-692 | ||
| doi>10.1145/2911451.2914681 | ||
|
Full text: |
||
|
Research on event detection in Twitter is often obstructed by the lack of publicly-available evaluation mechanisms such as test collections; this problem is more severe when considering the scarcity of them in languages other than English. In this paper, ...
|
||
| GNMID14: A Collection of 110 Million Global Music Identification Matches | ||
| Cameron Summers, Greg Tronel, Jason Cramer, Aneesh Vartakavi, Phillip Popp | ||
| Pages: 693-696 | ||
| doi>10.1145/2911451.2914679 | ||
|
Full text: |
||
|
A new dataset is presented composed of music identification matches from Gracenote, a leading global music metadata company. Matches from January 1, 2014 to December 31, 2014 have been curated and made available as a public dataset called Gracenote Music ...
|
||
| Longitudinal Navigation Log Data on a Large Web Domain | ||
| Suzan Verberne, Bram Arends, Wessel Kraaij, Arjen de Vries | ||
| Pages: 697-700 | ||
| doi>10.1145/2911451.2914667 | ||
|
Full text: |
||
|
We have collected the access logs for our university's web domain over a time span of 4.5 years. We now release the pre-processed data of a 3-month period for research into user navigation behavior. We preprocessed the data so that only successful GET ...
|
||
| New Collection Announcement: Focused Retrieval Over the Web | ||
| Ivan Habernal, Maria Sukhareva, Fiana Raiber, Anna Shtok, Oren Kurland, Hadar Ronen, Judit Bar-Ilan, Iryna Gurevych | ||
| Pages: 701-704 | ||
| doi>10.1145/2911451.2914682 | ||
|
Full text: |
||
|
Focused retrieval (a.k.a. passage retrieval) is important in its own right and as an intermediate step in question answering systems. We present a new Web-based collection for focused retrieval. The document corpus is the Category A of the ClueWeb12 ...
|
||
| NTCIR Lifelog: The First Test Collection for Lifelog Research | ||
| Cathal Gurrin, Hideo Joho, Frank Hopfgartner, Liting Zhou, Rami Albatal | ||
| Pages: 705-708 | ||
| doi>10.1145/2911451.2914680 | ||
|
Full text: |
||
|
Test collections have a long history of supporting repeatable and comparable evaluation in Information Retrieval (IR). However, thus far, no shared test collection exists for IR systems that are designed to index and retrieve multimodal lifelog data. ...
|
||
| SOGOU-2012-CRAWL: A Crawl of Search Results in the Sogou 2012 Chinese Query Log | ||
| Stewart Whiting, Joemon M. Jose, Omar Alonso | ||
| Pages: 709-712 | ||
| doi>10.1145/2911451.2914668 | ||
|
Full text: |
||
|
In 2012, Sogou, a major Chinese web search engine, released a large-scale query log containing 43.5M user interactions, including submitted queries and clicked web page search results. This query log offers a deep sample of queries over a two-day period ...
|
||
| The BOLT IR Test Collections of Multilingual Passage Retrieval from Discussion Forums | ||
| Ian Soboroff, Kira Griffitt, Stephanie Strassel | ||
| Pages: 713-716 | ||
| doi>10.1145/2911451.2914674 | ||
|
Full text: |
||
|
This paper describes a new test collection for passage retrieval from multilingual, informal text. The task being modeled is that of a monolingual English-speaking user who wishes to search discussion forum text in a foreign language. The system retrieves ...
|
||
| The Factoid Queries Collection | ||
| Ido Guy, Dan Pelleg | ||
| Pages: 717-720 | ||
| doi>10.1145/2911451.2914676 | ||
|
Full text: |
||
|
We present a collection of over 15,000 queries, issued to commercial web search engines, whose answer is a single fact. The collection was produced based on queries landing on questions within a large community question answering website, each with a ...
|
||
| The LExR Collection for Expertise Retrieval in Academia | ||
| Vitor Mangaravite, Rodrygo L.T. Santos, Isac S. Ribeiro, Marcos André Gonçalves, Alberto H.F. Laender | ||
| Pages: 721-724 | ||
| doi>10.1145/2911451.2914678 | ||
|
Full text: |
||
|
Expertise retrieval has been the subject of intense research over the past decade, particularly with the public availability of benchmark test collections for expertise retrieval in enterprises. Another domain which has seen comparatively less research ...
|
||
| UQV100: A Test Collection with Query Variability | ||
| Peter Bailey, Alistair Moffat, Falk Scholer, Paul Thomas | ||
| Pages: 725-728 | ||
| doi>10.1145/2911451.2914671 | ||
|
Full text: |
||
|
We describe the UQV100 test collection, designed to incorporate variability from users. Information need "backstories" were written for 100 topics (or sub-topics) from the TREC 2013 and 2014 Web Tracks. Crowd workers were asked to read the backstories, ...
|
||
| SESSION: Short Research Papers | ||
| A Dynamic Recurrent Model for Next Basket Recommendation | ||
| Feng Yu, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan | ||
| Pages: 729-732 | ||
| doi>10.1145/2911451.2914683 | ||
|
Full text: |
||
|
Next basket recommendation has become an increasing concern. Most conventional models explore either sequential transaction features or general interests of users. Further, some works treat users' general interests and sequential behaviors as two totally ...
|
||
| A Simple Enhancement for Ad-hoc Information Retrieval via Topic Modelling | ||
| Fanghong Jian, Jimmy Xiangji Huang, Jiashu Zhao, Tingting He, Po Hu | ||
| Pages: 733-736 | ||
| doi>10.1145/2911451.2914748 | ||
|
Full text: |
||
|
Traditional information retrieval (IR) models, in which a document is normally represented as a bag of words and their frequencies, capture the term-level and document-level information. Topic models, on the other hand, discover semantic topic-based ...
|
||
| An Empirical Study of Learning to Rank for Entity Search | ||
| Jing Chen, Chenyan Xiong, Jamie Callan | ||
| Pages: 737-740 | ||
| doi>10.1145/2911451.2914725 | ||
|
Full text: |
||
|
This work investigates the effectiveness of learning to rank methods for entity search. Entities are represented by multi-field documents constructed from their RDF triples, and field-based text similarity features are extracted for query-entity pairs. ...
|
||
| An Exploration of Evaluation Metrics for Mobile Push Notifications | ||
| Luchen Tan, Adam Roegiest, Jimmy Lin, Charles L.A. Clarke | ||
| Pages: 741-744 | ||
| doi>10.1145/2911451.2914694 | ||
|
Full text: |
||
|
How do we evaluate systems that filter social media streams and send users updates via push notifications on their mobile phones? Such notifications must be relevant, timely, and novel. In this paper, we explore various evaluation metrics for this task, ...
|
||
| An Improved Multileaving Algorithm for Online Ranker Evaluation | ||
| Brian Brost, Ingemar J. Cox, Yevgeny Seldin, Christina Lioma | ||
| Pages: 745-748 | ||
| doi>10.1145/2911451.2914706 | ||
|
Full text: |
||
|
Online ranker evaluation is a key challenge in information retrieval. An important task in the online evaluation of rankers is using implicit user feedback for inferring preferences between rankers. Interleaving methods have been found to be efficient ...
|
||
| An Unsupervised Approach to Anomaly Detection in Music Datasets | ||
| Yen-Cheng Lu, Chih-Wei Wu, Chang-Tien Lu, Alexander Lerch | ||
| Pages: 749-752 | ||
| doi>10.1145/2911451.2914700 | ||
|
Full text: |
||
|
This paper presents an unsupervised method for systematically identifying anomalies in music datasets. The model integrates categorical regression and robust estimation techniques to infer anomalous scores in music clips. When applied to a music genre ...
|
||
| Anonymizing Query Logs by Differential Privacy | ||
| Sicong Zhang, Hui Yang, Lisa Singh | ||
| Pages: 753-756 | ||
| doi>10.1145/2911451.2914732 | ||
|
Full text: |
||
|
Query logs are valuable resources for Information Retrieval (IR) research. However, because they are also rich in private and personal information, the huge concern of leaking user privacy prevents query logs from being shared from the search companies ...
|
||
| Audio Features Affected by Music Expressiveness: Experimental Setup and Preliminary Results on Tuba Players | ||
| Alberto Introini, Giorgio Presti, Giuseppe Boccignone | ||
| Pages: 757-760 | ||
| doi>10.1145/2911451.2914690 | ||
|
Full text: |
||
|
Within a Music Information Retrieval perspective, the goal of the study presented here is to investigate the impact on sound features of the musician's affective intention, namely when trying to intentionally convey emotional contents via expressiveness. ...
|
||
| Automatic Identification and Contextual Reformulation of Implicit System-Related Queries | ||
| Adam Fourney, Susan T. Dumais | ||
| Pages: 761-764 | ||
| doi>10.1145/2911451.2914701 | ||
|
Full text: |
||
|
Web search functionality is increasingly integrated into operating systems, software applications, and other interactive environments that extend beyond the traditional web browser. In particular, intelligent virtual assistants (e.g., Microsoft Cortana ...
|
||
| Axiomatic Analysis for Improving the Log-Logistic Feedback Model | ||
| Ali Montazeralghaem, Hamed Zamani, Azadeh Shakery | ||
| Pages: 765-768 | ||
| doi>10.1145/2911451.2914768 | ||
|
Full text: |
||
|
Pseudo-relevance feedback (PRF) has been proven to be an effective query expansion strategy to improve retrieval performance. Several PRF methods have so far been proposed for many retrieval models. Recent theoretical studies of PRF methods show that ...
|
||
| Balancing Relevance Criteria through Multi-Objective Optimization | ||
| Joost van Doorn, Daan Odijk, Diederik M. Roijers, Maarten de Rijke | ||
| Pages: 769-772 | ||
| doi>10.1145/2911451.2914708 | ||
|
Full text: |
||
|
Offline evaluation of information retrieval systems typically focuses on a single effectiveness measure that models the utility for a typical user. Such a measure usually combines a behavior-based rank discount with a notion of document utility that ...
|
||
| Build Emotion Lexicon from the Mood of Crowd via Topic-Assisted Joint Non-negative Matrix Factorization | ||
| Kaisong Song, Wei Gao, Ling Chen, Shi Feng, Daling Wang, Chengqi Zhang | ||
| Pages: 773-776 | ||
| doi>10.1145/2911451.2914759 | ||
|
Full text: |
||
|
In the research of building emotion lexicons, we witness the exploitation of crowd-sourced affective annotation given by readers of online news articles. Such an approach ignores the relationship between topics and emotion expressions which are often closely ...
|
||
| Burst Detection in Social Media Streams for Tracking Interest Profiles in Real Time | ||
| Cody Buntain, Jimmy Lin | ||
| Pages: 777-780 | ||
| doi>10.1145/2911451.2914733 | ||
|
Full text: |
||
|
This work presents RTTBurst, an end-to-end system for ingesting descriptions of user interest profiles and discovering new and relevant tweets based on those interest profiles using a simple model for identifying bursts in token usage. Our approach differs ...
|
||
| Cluster-based Joint Matrix Factorization Hashing for Cross-Modal Retrieval | ||
| Dimitrios Rafailidis, Fabio Crestani | ||
| Pages: 781-784 | ||
| doi>10.1145/2911451.2914710 | ||
|
Full text: |
||
|
Cross-modal retrieval has been an emerging topic over the last years, as modern applications have to efficiently search for multimedia documents with different modalities. In this study, we propose a cross-modal hashing method by following a cluster-based ...
|
||
| Collaborative Ranking with Social Relationships for Top-N Recommendations | ||
| Dimitrios Rafailidis, Fabio Crestani | ||
| Pages: 785-788 | ||
| doi>10.1145/2911451.2914711 | ||
|
Full text: |
||
|
Recommendation systems have gained a lot of attention because of their importance for handling the unprecedentedly large amount of available content on the Web, such as movies, music, books, etc. Although Collaborative Ranking (CR) models can produce ...
|
||
| Community-based Cyberreading for Information Understanding | ||
| Zhuoren Jiang, Xiaozhong Liu, Liangcai Gao, Zhi Tang | ||
| Pages: 789-792 | ||
| doi>10.1145/2911451.2914744 | ||
|
Full text: |
||
|
Although the content in scientific publications is increasingly challenging, it is necessary to investigate another important problem, that of scientific information understanding. For this proposed problem, we investigate novel methods to assist scholars ...
|
||
| Computational Creativity Based Video Recommendation | ||
| Wei Lu, Fu-lai Chung | ||
| Pages: 793-796 | ||
| doi>10.1145/2911451.2914707 | ||
|
Full text: |
||
|
Computational creativity, as an emerging domain of application, emphasizes the use of big data to automatically design new knowledge. Based on the availability of complex multi-relational data, one aspect of computational creativity is to infer unexplored ...
|
||
| Controversy Detection in Wikipedia Using Collective Classification | ||
| Shiri Dori-Hacohen, David Jensen, James Allan | ||
| Pages: 797-800 | ||
| doi>10.1145/2911451.2914745 | ||
|
Full text: |
||
|
Concerns over personalization in IR have sparked an interest in detection and analysis of controversial topics. Accurate detection would enable many beneficial applications, such as alerting search users to controversy. Wikipedia's broad coverage and ...
|
||
| Discovering Author Interest Evolution in Topic Modeling | ||
| Min Yang, Jincheng Mei, Fei Xu, Wenting Tu, Ziyu Lu | ||
| Pages: 801-804 | ||
| doi>10.1145/2911451.2914723 | ||
|
Full text: |
||
|
Discovering the author's interest over time from documents has important applications in recommendation systems, authorship identification and opinion extraction. In this paper, we propose an interest drift model (IDM), which monitors the evolution of ...
|
||
| Distributional Random Oversampling for Imbalanced Text Classification | ||
| Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani | ||
| Pages: 805-808 | ||
| doi>10.1145/2911451.2914722 | ||
|
Full text: |
||
|
The accuracy of many classification algorithms is known to suffer when the data are imbalanced (i.e., when the distribution of the examples across the classes is severely skewed). Many applications of binary text classification are of this type, with ...
|
||
| Doc2Sent2Vec: A Novel Two-Phase Approach for Learning Document Representation | ||
| Ganesh J, Manish Gupta, Vasudeva Varma | ||
| Pages: 809-812 | ||
| doi>10.1145/2911451.2914717 | ||
|
Full text: |
||
|
Doc2Sent2Vec is an unsupervised approach to learn a low-dimensional feature vector (or embedding) for a document. This embedding captures the semantics of the document and can be fed as input to machine learning algorithms to solve a myriad of applications ...
|
||
| Dynamically Integrating Item Exposure with Rating Prediction in Collaborative Filtering | ||
| Ting-Yi Shih, Ting-Chang Hou, Jian-De Jiang, Yen-Chieh Lien, Chia-Rui Lin, Pu-Jen Cheng | ||
| Pages: 813-816 | ||
| doi>10.1145/2911451.2914769 | ||
|
Full text: |
||
|
The paper proposes a novel approach to appropriately promote those items with few ratings in collaborative filtering. Different from previous works, we force the items with few ratings to be promoted to the users who would potentially be able to give ...
|
||
| Effective Trend Detection within a Dynamic Search Context | ||
| Anat Hashavit, Roy Levin, Ido Guy, Gilad Kutiel | ||
| Pages: 817-820 | ||
| doi>10.1145/2911451.2914705 | ||
|
Full text: |
||
|
In recent years, studies about trend detection in online social media streams have begun to emerge. Since not all users are likely to always be interested in the same set of trends, some of the research also focused on personalizing the trends by using ...
|
||
| Enhancing First Story Detection using Word Embeddings | ||
| Sean Moran, Richard McCreadie, Craig Macdonald, Iadh Ounis | ||
| Pages: 821-824 | ||
| doi>10.1145/2911451.2914719 | ||
|
Full text: |
||
|
In this paper we show how word embeddings can be used to increase the effectiveness of a state-of-the-art Locality Sensitive Hashing (LSH) based first story detection (FSD) system over a standard tweet corpus. Vocabulary mismatch, in which related tweets ...
|
||
| Examining the Coherence of the Top Ranked Tweet Topics | ||
| Anjie Fang, Craig Macdonald, Iadh Ounis, Philip Habel | ||
| Pages: 825-828 | ||
| doi>10.1145/2911451.2914731 | ||
|
Full text: |
||
|
Topic modelling approaches help scholars to examine the topics discussed in a corpus. Due to the popularity of Twitter, two distinct methods have been proposed to accommodate the brevity of tweets: the tweet pooling method and Twitter LDA. Both of these ...
|
||
| Explicit In Situ User Feedback for Web Search Results | ||
| Jin Young Kim, Jaime Teevan, Nick Craswell | ||
| Pages: 829-832 | ||
| doi>10.1145/2911451.2914754 | ||
|
Full text: |
||
|
Gathering evidence about whether a search result is relevant is a core concern in the evaluation and improvement of information retrieval systems. Two common sources of evidence for establishing relevance are judgements from trained assessors and logs ...
|
||
| Exploiting CPU SIMD Extensions to Speed-up Document Scoring with Tree Ensembles | ||
| Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini | ||
| Pages: 833-836 | ||
| doi>10.1145/2911451.2914758 | ||
|
Full text: |
||
|
Scoring documents with learning-to-rank (LtR) models based on large ensembles of regression trees is currently deemed one of the best solutions to effectively rank query results to be returned by large scale Information Retrieval systems. This paper ...
|
||
| Exploiting Semantic Coherence Features for Information Retrieval | ||
| Xinhui Tu, Jimmy Xiangji Huang, Jing Luo, Tingting He | ||
| Pages: 837-840 | ||
| doi>10.1145/2911451.2914691 | ||
|
Full text: |
||
|
Most of the existing information retrieval models assume that the terms of a text document are independent of each other. These retrieval models integrate three major variables to determine the degree of importance of a term for a document: within document ...
|
||
| Extracting Information Seeking Intentions for Web Search Sessions | ||
| Matthew Mitsui, Chirag Shah, Nicholas J. Belkin | ||
| Pages: 841-844 | ||
| doi>10.1145/2911451.2914746 | ||
|
Full text: |
||
|
We present a method for extracting the self-reported intentions of users engaged in an information seeking episode. We recruited participants to conduct search sessions and subsequently asked them to self-report their intentions. A total of 27 users ...
|
||
| First Story Detection using Multiple Nearest Neighbors | ||
| Jeroen B.P. Vuurens, Arjen P. de Vries | ||
| Pages: 845-848 | ||
| doi>10.1145/2911451.2914761 | ||
|
Full text: |
||
|
First Story Detection (FSD) systems aim to identify those news articles that discuss an event that was not reported before. Recent work on FSD has focussed almost exclusively on efficiently detecting documents that are dissimilar from their nearest neighbor. ...
|
||
| Health Monitoring on Social Media over Time | ||
| Sumit Sidana, Shashwat Mishra, Sihem Amer-Yahia, Marianne Clausel, Massih-Reza Amini | ||
| Pages: 849-852 | ||
| doi>10.1145/2911451.2914697 | ||
|
Full text: |
||
|
Social media has become a major source for analyzing all aspects of daily life. Thanks to dedicated latent topic analysis methods such as the Ailment Topic Aspect Model (ATAM), public health can now be observed on Twitter. In this work, we are interested ...
|
||
| How Informative is a Term?: Dispersion as a measure of Term Specificity | ||
| Rodney McDonell, Justin Zobel, Bodo Billerbeck | ||
| Pages: 853-856 | ||
| doi>10.1145/2911451.2914687 | ||
|
Full text: |
||
|
Similarity functions assign scores to documents in response to queries. These functions require as input statistics about the terms in the queries and documents, where the intention is that the statistics are estimates of the relative informativeness ...
|
||
| Identifying Careless Workers in Crowdsourcing Platforms: A Game Theory Approach | ||
| Yashar Moshfeghi, Alvaro F. Huertas-Rosero, Joemon M. Jose | ||
| Pages: 857-860 | ||
| doi>10.1145/2911451.2914756 | ||
|
Full text: |
||
|
In this paper we introduce a game scenario for crowdsourcing (CS) using incentives as a bait for careless (gambler) workers, who respond to them in a characteristic way. We hypothesise that careless workers are risk-inclined and can be detected in the ...
|
||
| Impact of Review-Set Selection on Human Assessment for Text Classification | ||
| Adam Roegiest, Gordon V. Cormack | ||
| Pages: 861-864 | ||
| doi>10.1145/2911451.2914709 | ||
|
Full text: |
||
|
In a laboratory study, human assessors were significantly more likely to judge the same documents as relevant when they were presented for assessment within the context of documents selected using random or uncertainty sampling, as compared to relevance ...
|
||
| Improving Automated Controversy Detection on the Web | ||
| Myungha Jang, James Allan | ||
| Pages: 865-868 | ||
| doi>10.1145/2911451.2914764 | ||
|
Full text: |
||
|
Automatically detecting controversy on the Web is a useful capability for a search engine to help users review web content with a more balanced and critical view. The current state-of-the-art approach is to find K-Nearest-Neighbors in Wikipedia to the ...
|
||
| Improving Language Estimation with the Paragraph Vector Model for Ad-hoc Retrieval | ||
| Qingyao Ai, Liu Yang, Jiafeng Guo, W. Bruce Croft | ||
| Pages: 869-872 | ||
| doi>10.1145/2911451.2914688 | ||
|
Full text: |
||
|
Incorporating topic level estimation into language models has been shown to be beneficial for information retrieval (IR) models such as cluster-based retrieval and LDA-based document representation. Neural embedding models, such as paragraph vector (PV) ...
|
||
| Improving Retrieval Quality Using Pseudo Relevance Feedback in Content-Based Image Retrieval | ||
| Dinesha Chathurani Nanayakkara Wasam Uluwitige, Timothy Chappell, Shlomo Geva, Vinod Chandran | ||
| Pages: 873-876 | ||
| doi>10.1145/2911451.2914747 | ||
|
Full text: |
||
|
The increased availability of image capturing devices has enabled collections of digital images to rapidly expand in both size and diversity. This has created a constantly growing need for efficient and effective image browsing, searching, and retrieval ...
|
||
| Ingrams: A Neuropsychological Explanation For Why People Search | ||
| Peter Bailey, Nick Craswell | ||
| Pages: 877-880 | ||
| doi>10.1145/2911451.2914712 | ||
|
Full text: |
||
|
Why do people start a search? Why do they stop? Why do they do what they do in-between? Our goal in this paper is to provide a simple yet general explanation for these acts that has its basis in neuropsychology and observed user behavior. We coin the ...
|
||
| Investment Recommendation using Investor Opinions in Social Media | ||
| Wenting Tu, David W. Cheung, Nikos Mamoulis, Min Yang, Ziyu Lu | ||
| Pages: 881-884 | ||
| doi>10.1145/2911451.2914699 | ||
|
Full text: |
||
|
Investor social media, such as StockTwits, are gaining increasing popularity. These sites allow users to post their investing opinions and suggestions in the form of microblogs. Given the growth of the posted data, a significant and challenging research ...
|
||
| "Is Sven Seven?": A Search Intent Module for Children | ||
| Nevena Dragovic, Ion Madrazo Azpiazu, Maria Soledad Pera | ||
| Pages: 885-888 | ||
| doi>10.1145/2911451.2914738 | ||
|
Full text: |
||
|
The Internet is the biggest data-sharing platform, comprised of an immeasurable quantity of resources covering diverse topics appealing to users of all ages. Children shape tomorrow's society, so it is essential that this audience becomes agile with ...
|
||
| Is This Your Final Answer?: Evaluating the Effect of Answers on Good Abandonment in Mobile Search | ||
| Kyle Williams, Julia Kiseleva, Aidan C. Crook, Imed Zitouni, Ahmed Hassan Awadallah, Madian Khabsa | ||
| Pages: 889-892 | ||
| doi>10.1145/2911451.2914736 | ||
|
Full text: |
||
|
Answers on mobile search result pages have become a common way to attempt to satisfy users without them needing to click on search results. Many different types of answers exist, such as weather, flight and currency answers. Understanding the effect ...
|
||
| Jointly Modeling Review Content and Aspect Ratings for Review Rating Prediction | ||
| Zhipeng Jin, Qiudan Li, Daniel D. Zeng, YongCheng Zhan, Ruoran Liu, Lei Wang, Hongyuan Ma | ||
| Pages: 893-896 | ||
| doi>10.1145/2911451.2914692 | ||
|
Full text: |
||
|
Review rating prediction is of much importance for sentiment analysis and business intelligence. Existing methods work well when aspect-opinion pairs can be accurately extracted from review texts and aspect ratings are complete. The challenges of improving ...
|
||
| Learning to Project and Binarise for Hashing Based Approximate Nearest Neighbour Search | ||
| Sean Moran | ||
| Pages: 897-900 | ||
| doi>10.1145/2911451.2914766 | ||
|
Full text: |
||
|
In this paper we focus on improving the effectiveness of hashing-based approximate nearest neighbour search. Generating similarity preserving hashcodes for images has been shown to be an effective and efficient method for searching through large datasets. ...
|
||
| Linking Organizational Social Network Profiles | ||
| Jerome Cheng, Kazunari Sugiyama, Min-Yen Kan | ||
| Pages: 901-904 | ||
| doi>10.1145/2911451.2914698 | ||
|
Full text: |
||
|
Many organizations possess social media accounts on different social networks, but these profiles are not always linked. End applications, users, as well as the organization themselves, can benefit when the profiles are appropriately identified and linked. ...
|
||
| Load-Balancing in Distributed Selective Search | ||
| Yubin Kim, Jamie Callan, J. Shane Culpepper, Alistair Moffat | ||
| Pages: 905-908 | ||
| doi>10.1145/2911451.2914689 | ||
|
Full text: |
||
|
Simulation and analysis have shown that selective search can reduce the cost of large-scale distributed information retrieval. By partitioning the collection into small topical shards, and then using a resource ranking algorithm to choose a subset of ...
|
||
| Multi-Rate Deep Learning for Temporal Recommendation | ||
| Yang Song, Ali Mamdouh Elkahky, Xiaodong He | ||
| Pages: 909-912 | ||
| doi>10.1145/2911451.2914726 | ||
|
Full text: |
||
|
Modeling temporal behavior in recommendation systems is an important and challenging problem. Its challenges come from the fact that temporal modeling increases the cost of parameter estimation and inference, while requiring a large amount of data to reliably ...
|
||
| Network-Aware Recommendations of Novel Tweets | ||
| Noor Aldeen Alawad, Aris Anagnostopoulos, Stefano Leonardi, Ida Mele, Fabrizio Silvestri | ||
| Pages: 913-916 | ||
| doi>10.1145/2911451.2914760 | ||
|
Full text: |
||
|
With the rapid proliferation of microblogging services such as Twitter, a large number of tweets are published every day, often making users feel overwhelmed with information. Helping these users to discover potentially interesting tweets is an important ...
|
||
| Not All Links Are Created Equal: An Adaptive Embedding Approach for Social Personalized Ranking | ||
| Qing Zhang, Houfeng Wang | ||
| Pages: 917-920 | ||
| doi>10.1145/2911451.2914740 | ||
|
Full text: |
||
|
With a large amount of complex network data available, most existing recommendation models consider exploiting rich user social relations for better interest targeting. In these approaches, the underlying assumption is that similar users in social networks ...
|
||
| On a Topic Model for Sentences | ||
| Georgios Balikas, Massih-Reza Amini, Marianne Clausel | ||
| Pages: 921-924 | ||
| doi>10.1145/2911451.2914714 | ||
|
Full text: |
||
|
Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text spans such ...
|
||
| On Information-Theoretic Document-Person Associations for Expert Search in Academia | ||
| Vitor Mangaravite, Rodrygo L.T. Santos | ||
| Pages: 925-928 | ||
| doi>10.1145/2911451.2914751 | ||
|
Full text: |
||
|
State-of-the-art expert search approaches rely on document-person associations to infer the expertise of a candidate person for a given query. Such associations have traditionally been modeled as boolean variables, indicating whether or not a candidate ...
|
||
| On the Applicability of Delicious for Temporal Search on Web Archives | ||
| Helge Holzmann, Wolfgang Nejdl, Avishek Anand | ||
| Pages: 929-932 | ||
| doi>10.1145/2911451.2914724 | ||
|
Full text: |
||
|
Web archives are large longitudinal collections that store webpages from the past, which might be missing on the current live Web. Consequently, temporal search over such collections is essential for finding prominent missing webpages and tasks like ...
|
||
| On the Effectiveness of Contextualisation Techniques in Spoken Query Spoken Content Retrieval | ||
| David N. Racca, Gareth J.F. Jones | ||
| Pages: 933-936 | ||
| doi>10.1145/2911451.2914730 | ||
|
Full text: |
||
|
In passage and XML retrieval, contextualisation techniques seek to improve the rank of a relevant element by considering information from its surrounding elements and its container document. Recent research has demonstrated that some of these techniques ...
|
||
| Ordinal Text Quantification | ||
| Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani | ||
| Pages: 937-940 | ||
| doi>10.1145/2911451.2914749 | ||
|
Full text: |
||
|
In recent years there has been a growing interest in text quantification, a supervised learning task where the goal is to accurately estimate, in an unlabelled set of items, the prevalence (or "relative frequency") of each class c in a predefined ...
|
||
| Pearson Rank: A Head-Weighted Gap-Sensitive Score-Based Correlation Coefficient | ||
| Ning Gao, Mossaab Bagdouri, Douglas W. Oard | ||
| Pages: 941-944 | ||
| doi>10.1145/2911451.2914728 | ||
|
Full text: |
||
|
One way of evaluating the reusability of a test collection is to determine whether removing the unique contributions of some system would alter the preference order between that system and others. Rank correlation measures such as Kendall's tau are often ...
|
||
| Polarized User and Topic Tracking in Twitter | ||
| Mauro Coletto, Claudio Lucchese, Salvatore Orlando, Raffaele Perego | ||
| Pages: 945-948 | ||
| doi>10.1145/2911451.2914716 | ||
|
Full text: |
||
|
Digital traces of conversations in micro-blogging platforms and OSNs provide information about user opinion with a high degree of resolution. These information sources can be exploited to understand and monitor collective behaviours. In this work, we ...
|
||
| Post-Learning Optimization of Tree Ensembles for Efficient Ranking | ||
| Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Fabrizio Silvestri, Salvatore Trani | ||
| Pages: 949-952 | ||
| doi>10.1145/2911451.2914763 | ||
|
Full text: |
||
|
Learning to Rank (LtR) is the machine learning method of choice for producing high quality document ranking functions from a ground-truth of training examples. In practice, efficiency and effectiveness are intertwined concepts and trading off effectiveness ...
|
||
| Quit While Ahead: Evaluating Truncated Rankings | ||
| Fei Liu, Alistair Moffat, Timothy Baldwin, Xiuzhen Zhang | ||
| Pages: 953-956 | ||
| doi>10.1145/2911451.2914737 | ||
|
Full text: |
||
|
Many types of search tasks are answered through the computation of a ranked list of suggested answers. We re-examine the usual assumption that answer lists should be as long as possible, and suggest that when the number of matching items is potentially ...
|
||
| Quote Recommendation in Dialogue using Deep Neural Network | ||
| Hanbit Lee, Yeonchan Ahn, Haejun Lee, Seungdo Ha, Sang-goo Lee | ||
| Pages: 957-960 | ||
| doi>10.1145/2911451.2914734 | ||
|
Full text: |
||
|
Quotes, or quotations, are well-known phrases or sentences that we use for various purposes such as emphasis, elaboration, and humor. In this paper, we introduce the task of recommending quotes that are suitable for a given dialogue context and we present ...
|
||
| Ranking Documents Through Stochastic Sampling on Bayesian Network-based Models: A Pilot Study | ||
| Xing Tan, Jimmy Xiangji Huang, Aijun An | ||
| Pages: 961-964 | ||
| doi>10.1145/2911451.2914750 | ||
|
Full text: |
||
|
Using approximate inference techniques, we investigate in this paper the applicability of Bayesian Networks to the problem of ranking a large set of documents. The topology of the network is bipartite. Network parameters (conditional probability distributions) ...
|
||
| Ranking Health Web Pages with Relevance and Understandability | ||
| Joao Palotti, Lorraine Goeuriot, Guido Zuccon, Allan Hanbury | ||
| Pages: 965-968 | ||
| doi>10.1145/2911451.2914741 | ||
|
Full text: |
||
|
We propose a method that integrates relevance and understandability to rank health web documents. We use a learning to rank approach with standard retrieval features to determine topical relevance and additional features based on readability measures ...
|
||
| Rethinking the Cost of Information Search Behavior | ||
| Yinglong Zhang, Jacek Gwizdka | ||
| Pages: 969-972 | ||
| doi>10.1145/2911451.2914742 | ||
|
Full text: |
||
|
In this paper, we present a cognitive-economic approach to examining the cost in information search. Unlike previous studies on economic models, we calculated the cost in information search based on participants' eye-tracking data as well as their behavioral ...
|
||
| Retrievability of Code Mixed Microblogs | ||
| Debasis Ganguly, Ayan Bandyopadhyay, Mandar Mitra, Gareth J.F. Jones | ||
| Pages: 973-976 | ||
| doi>10.1145/2911451.2914727 | ||
|
Full text: |
||
|
Mixing multiple languages within the same document, a phenomenon called (linguistic) code mixing or code switching, is a frequent trend among multilingual users of social media. In the context of information retrieval (IR), code mixing may affect retrieval ...
|
||
| Retweeting Behavior Prediction Based on One-Class Collaborative Filtering in Social Networks | ||
| Bo Jiang, Jiguang Liang, Ying Sha, Rui Li, Wei Liu, Hongyuan Ma, Lihong Wang | ||
| Pages: 977-980 | ||
| doi>10.1145/2911451.2914713 | ||
|
Full text: |
||
|
Social behaviors such as retweets, comments or likes are valuable information for the analysis of human activities. We focus here on users' retweeting behavior, which has been considered a key mechanism of information diffusion in social networks. Since ...
|
||
| Sampling Strategies and Active Learning for Volume Estimation | ||
| Haotian Zhang, Jimmy Lin, Gordon V. Cormack, Mark D. Smucker | ||
| Pages: 981-984 | ||
| doi>10.1145/2911451.2914685 | ||
|
Full text: |
||
|
This paper tackles the challenge of accurately and efficiently estimating the number of relevant documents in a collection for a particular topic. One real-world application is estimating the volume of social media posts (e.g., tweets) pertaining to ...
|
||
| Search-based Evaluation from Truth Transcripts for Voice Search Applications | ||
| François Mairesse, Paul Raccuglia, Shiv Vitaladevuni | ||
| Pages: 985-988 | ||
| doi>10.1145/2911451.2914735 | ||
|
Full text: |
||
|
Voice search applications are typically evaluated by comparing the predicted query to a reference human transcript, regardless of the search results returned by the query. While we find that an exact transcript match is highly indicative of user satisfaction, ...
|
||
| Seeking Serendipity: A Living Lab Approach to Understanding Creative Retrieval in Broadcast Media Production | ||
| Sabrina Sauer, Maarten de Rijke | ||
| Pages: 989-992 | ||
| doi>10.1145/2911451.2914721 | ||
|
Full text: |
||
|
This paper presents a method to map user needs and integrate serendipitous search behaviors in search algorithm development: the living lab approach. This user-centered design approach involves technology users during technology development to catch ...
|
||
| Selectively Personalizing Query Auto-Completion | ||
| Fei Cai, Maarten de Rijke | ||
| Pages: 993-996 | ||
| doi>10.1145/2911451.2914686 | ||
|
Full text: |
||
|
Query auto-completion (QAC) is being used by many of today's search engines. It helps searchers formulate queries by providing a list of query completions after entering an initial prefix of a query. To cater for a user's specific information needs, ...
|
||
| SG++: Word Representation with Sentiment and Negation for Twitter Sentiment Classification | ||
| Qinmin Hu, Yijun Pei, Qin Chen, Liang He | ||
| Pages: 997-1000 | ||
| doi>10.1145/2911451.2914718 | ||
|
Full text: |
||
|
Here we propose an advanced Skip-gram model to incorporate both word sentiment and negation information. In particular, there is a softmax layer for the word sentiment polarity upon the Skip-gram model. Then, two parallel embedding layers are set ...
|
||
| SGT Framework: Social, Geographical and Temporal Relevance for Recreational Queries in Web Search | ||
| Stewart Whiting, Omar Alonso | ||
| Pages: 1001-1004 | ||
| doi>10.1145/2911451.2914743 | ||
|
Full text: |
||
|
While location-based social networks (LBSNs) have become widely used for sharing and consuming location information, a large number of users turn to general web search engines for recreational activity ideas. In these cases, users typically express a ...
|
||
| SimCC-AT: A Method to Compute Similarity of Scientific Papers with Automatic Parameter Tuning | ||
| Masoud Reyhani Hamedani, Sang-Wook Kim | ||
| Pages: 1005-1008 | ||
| doi>10.1145/2911451.2914715 | ||
|
Full text: |
||
|
In this paper, we propose SimCC-AT (similarity based on content and citations with automatic parameter tuning) to compute the similarity of scientific papers. As in SimCC, the state-of-the-art method, we exploit a notion of a contribution score in similarity ...
|
||
| Simple Dynamic Emission Strategies for Microblog Filtering | ||
| Luchen Tan, Adam Roegiest, Charles L.A. Clarke, Jimmy Lin | ||
| Pages: 1009-1012 | ||
| doi>10.1145/2911451.2914704 | ||
|
Full text: |
||
|
Push notifications from social media provide a method to keep up-to-date on topics of personal interest. To be effective, notifications must achieve a balance between pushing too much and pushing too little. Push too little and the user misses important ...
|
||
| Subspace Clustering Based Tag Sharing for Inductive Tag Matrix Refinement with Complex Errors | ||
| Yuqing Hou, Zhouchen Lin, Jin-ge Yao | ||
| Pages: 1013-1016 | ||
| doi>10.1145/2911451.2914693 | ||
|
Full text: |
||
|
Annotating images with tags is useful for indexing and retrieving images. However, many available annotation data include missing or inaccurate annotations. In this paper, we propose an image annotation framework which sequentially performs tag completion ...
|
||
| Temporal Query Intent Disambiguation using Time-Series Data | ||
| Yue Zhao, Claudia Hauff | ||
| Pages: 1017-1020 | ||
| doi>10.1145/2911451.2914767 | ||
|
Full text: |
||
|
Understanding temporal intents behind users' queries is essential to meet users' time-related information needs. In order to classify queries according to their temporal intent (e.g. Past or Future), we explore the usage of time-series data derived from ...
|
||
| To Blend or Not to Blend?: Perceptual Speed, Visual Memory and Aggregated Search | ||
| Lauren Turpin, Diane Kelly, Jaime Arguello | ||
| Pages: 1021-1024 | ||
| doi>10.1145/2911451.2914739 | ||
|
Full text: |
||
|
While aggregated search interfaces that present vertical results to searchers are fairly common in today's search environments, little is known about how searchers' cognitive abilities impact how they use and evaluate these interfaces. This study evaluates ...
|
||
| Topic Model based Privacy Protection in Personalized Web Search | ||
| Wasi Uddin Ahmad, Md Masudur Rahman, Hongning Wang | ||
| Pages: 1025-1028 | ||
| doi>10.1145/2911451.2914753 | ||
|
Full text: |
||
|
Modern search engines utilize users' search history for personalization, which provides more effective, useful and relevant search results. However, it also has the potential risk of revealing users' privacy by identifying their underlying intention ...
|
||
| Topic Quality Metrics Based on Distributed Word Representations | ||
| Sergey I. Nikolenko | ||
| Pages: 1029-1032 | ||
| doi>10.1145/2911451.2914720 | ||
|
Full text: |
||
|
Automated evaluation of topic quality remains an important unsolved problem in topic modeling and represents a major obstacle for development and evaluation of new topic models. Previous attempts at the problem have been formulated as variations on the ...
|
||
| Toward Estimating the Rank Correlation between the Test Collection Results and the True System Performance | ||
| Julián Urbano, Mónica Marrero | ||
| Pages: 1033-1036 | ||
| doi>10.1145/2911451.2914752 | ||
|
Full text: |
||
|
The Kendall τ and AP rank correlation coefficients have become mainstream in Information Retrieval research for comparing the rankings of systems produced by two different evaluation conditions, such as different effectiveness measures or pool depths. ...
|
||
| Tracking Sentiment by Time Series Analysis | ||
| Anastasia Giachanou, Fabio Crestani | ||
| Pages: 1037-1040 | ||
| doi>10.1145/2911451.2914702 | ||
|
Full text: |
||
|
In recent years, social media have emerged as popular platforms for people to share their thoughts and opinions on all kinds of topics. Tracking opinion over time is a powerful tool that can be used for sentiment prediction or to detect the possible reasons ...
|
||
| Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder | ||
| Soroush Vosoughi, Prashanth Vijayaraghavan, Deb Roy | ||
| Pages: 1041-1044 | ||
| doi>10.1145/2911451.2914762 | ||
|
Full text: |
||
|
We present Tweet2Vec, a novel method for generating general-purpose vector representations of tweets. The model learns tweet embeddings using a character-level CNN-LSTM encoder-decoder. We trained our model on 3 million randomly selected English-language ...
|
||
| Two Sample T-tests for IR Evaluation: Student or Welch? | ||
| Tetsuya Sakai | ||
| Pages: 1045-1048 | ||
| doi>10.1145/2911451.2914684 | ||
|
Full text: |
||
|
There are two well-known versions of the t-test for comparing means from unpaired data: Student's t-test and Welch's t-test. While Welch's t-test does not assume homoscedasticity (i.e., equal variances), it involves approximations. ...
|
||
| Uncovering Task Based Behavioral Heterogeneities in Online Search Behavior | ||
| Rishabh Mehrotra, Prasanta Bhattacharya, Emine Yilmaz | ||
| Pages: 1049-1052 | ||
| doi>10.1145/2911451.2914755 | ||
|
Full text: |
||
|
While a major share of prior work has considered search sessions as the focal unit of analysis for seeking behavioral insights, search tasks are emerging as a competing perspective in this space. In the current work, we quantify user search task behavior ...
|
||
| Understanding Website Behavior based on User Agent | ||
| Kien Pham, Aécio Santos, Juliana Freire | ||
| Pages: 1053-1056 | ||
| doi>10.1145/2911451.2914757 | ||
|
Full text: |
||
|
Web sites have adopted a variety of adversarial techniques to prevent web crawlers from retrieving their content. While it is possible to simulate user behavior using a browser to crawl such sites, this approach is not scalable. Therefore, understanding ...
|
||
| Using Word Embedding to Evaluate the Coherence of Topics from Twitter Data | ||
| Anjie Fang, Craig Macdonald, Iadh Ounis, Philip Habel | ||
| Pages: 1057-1060 | ||
| doi>10.1145/2911451.2914729 | ||
|
Full text: |
||
|
Scholars often seek to understand topics discussed on Twitter using topic modelling approaches. Several coherence metrics have been proposed for evaluating the coherence of the topics generated by these approaches, including the pre-calculated Pointwise ...
|
||
| Utilizing Focused Relevance Feedback | ||
| Elinor Brondwine, Anna Shtok, Oren Kurland | ||
| Pages: 1061-1064 | ||
| doi>10.1145/2911451.2914695 | ||
|
Full text: |
||
|
We present a novel study of ad hoc retrieval methods utilizing document-level relevance feedback and/or focused relevance feedback; namely, passages marked as (non-)relevant. The first method uses a novel mixture model that integrates relevant ...
|
||
| What Makes a Query Temporally Sensitive? | ||
| Craig Willis, Garrick Sherman, Miles Efron | ||
| Pages: 1065-1068 | ||
| doi>10.1145/2911451.2914703 | ||
|
Full text: |
||
|
This work takes an in-depth look at the factors that affect manual classifications of 'temporally sensitive' information needs. We use qualitative and quantitative techniques to analyze 660 topics from the Text Retrieval Conference (TREC) previously ...
|
||
| Which Information Sources are More Effective and Reliable in Video Search | ||
| Zhiyong Cheng, Xuanchong Li, Jialie Shen, Alexander G. Hauptmann | ||
| Pages: 1069-1072 | ||
| doi>10.1145/2911451.2914765 | ||
|
Full text: |
||
|
Users are often interested in finding video segments that contain further information about the video content in a segment of interest. To help users find and browse related video content, video hyperlinking aims at constructing ...
|
||
| Why do you Think this Query is Difficult?: A User Study on Human Query Prediction | ||
| Stefano Mizzaro, Josiane Mothe | ||
| Pages: 1073-1076 | ||
| doi>10.1145/2911451.2914696 | ||
|
Full text: |
||
|
Predicting if a query will be difficult for a system is important to improve retrieval effectiveness by implementing specific processing. There have been several attempts to predict difficulty, both automatically and manually; but without high accuracy ...
|
||
| DEMONSTRATION SESSION: Demonstrations | ||
| Craig Macdonald | ||
| A Platform for Streaming Push Notifications to Mobile Assessors | ||
| Adam Roegiest, Luchen Tan, Jimmy Lin, Charles L.A. Clarke | ||
| Pages: 1077-1080 | ||
| doi>10.1145/2911451.2911463 | ||
|
Full text: |
||
|
We present an assessment platform for gathering online relevance judgments for mobile push notifications that will be deployed in the newly-created TREC 2016 Real-Time Summarization (RTS) track. There is emerging interest in building systems that filter ...
|
||
| A Visual Analytics Approach for What-If Analysis of Information Retrieval Systems | ||
| Marco Angelini, Nicola Ferro, Giuseppe Santucci, Gianmaria Silvello | ||
| Pages: 1081-1084 | ||
| doi>10.1145/2911451.2911462 | ||
|
Full text: |
||
|
We present the innovative visual analytics approach of the VATE system, which eases and makes more effective the experimental evaluation process by introducing the what-if analysis. The what-if analysis is aimed at estimating the possible effects of ...
|
||
| An Architecture for Privacy-Preserving and Replicable High-Recall Retrieval Experiments | ||
| Adam Roegiest, Gordon V. Cormack | ||
| Pages: 1085-1088 | ||
| doi>10.1145/2911451.2911456 | ||
|
Full text: |
||
|
We demonstrate the infrastructure used in the TREC 2015 Total Recall track to facilitate controlled simulation of "assessor in the loop" high-recall retrieval experimentation. The implementation and corresponding design decisions are presented for this ...
|
||
| Analysing Temporal Evolution of Interlingual Wikipedia Article Pairs | ||
| Simon Gottschalk, Elena Demidova | ||
| Pages: 1089-1092 | ||
| doi>10.1145/2911451.2911472 | ||
|
Full text: |
||
|
Wikipedia articles representing an entity or a topic in different language editions evolve independently within the scope of the language-specific user communities. This can lead to different points of view reflected in the articles, as well as complementary ...
|
||
| Cobwebs from the Past and Present: Extracting Large Social Networks using Internet Archive Data | ||
| Miroslav Shaltev, Jan-Hendrik Zab, Philipp Kemkes, Stefan Siersdorfer, Sergej Zerr | ||
| Pages: 1093-1096 | ||
| doi>10.1145/2911451.2911467 | ||
|
Full text: |
||
|
Social graph construction from various sources has been of interest to researchers due to its application potential and the broad range of technical challenges involved. The World Wide Web provides a huge amount of continuously updated data and information ...
|
||
| Context-Sensitive Auto-Completion for Searching with Entities and Categories | ||
| Andreas Schmidt, Johannes Hoffart, Dragan Milchevski, Gerhard Weikum | ||
| Pages: 1097-1100 | ||
| doi>10.1145/2911451.2911461 | ||
|
Full text: |
||
|
When searching in a document collection by keywords, good auto-completion suggestions can be derived from query logs and corpus statistics. On the other hand, when querying documents which have automatically been linked to entities and semantic categories, ...
|
||
| EAIMS: Emergency Analysis Identification and Management System | ||
| Richard McCreadie, Craig Macdonald, Iadh Ounis | ||
| Pages: 1101-1104 | ||
| doi>10.1145/2911451.2911460 | ||
|
Full text: |
||
|
Social media has great potential as a means to enable civil protection and law enforcement agencies to more effectively tackle disasters and emergencies. However, there is currently a lack of tools that enable civil protection agencies to easily make ...
|
||
| Expedition: A Time-Aware Exploratory Search System Designed for Scholars | ||
| Jaspreet Singh, Wolfgang Nejdl, Avishek Anand | ||
| Pages: 1105-1108 | ||
| doi>10.1145/2911451.2911465 | ||
|
Full text: |
||
|
Archives are an important source of study for various scholars. Digitization and the web have made archives more accessible and led to the development of several time-aware exploratory search systems. However, these systems have been designed for more ...
|
||
| iGlasses: A Novel Recommendation System for Best-fit Glasses | ||
| Xiaoling Gu, Lidan Shou, Pai Peng, Ke Chen, Sai Wu, Gang Chen | ||
| Pages: 1109-1112 | ||
| doi>10.1145/2911451.2911453 | ||
|
Full text: |
||
|
We demonstrate iGlasses, a novel recommendation system that accepts a frontal face photo as the input and returns the best-fit eyeglasses as the output. As conventional recommendation techniques such as collaborative filtering become inapplicable ...
|
||
| InfoScout: An Interactive, Entity Centric, Person Search Tool | ||
| Sean McKeown, Martynas Buivys, Leif Azzopardi | ||
| Pages: 1113-1116 | ||
| doi>10.1145/2911451.2911468 | ||
|
Full text: |
||
|
Individuals living in highly networked societies publish a large amount of personal, and potentially sensitive, information online. Web investigators can exploit such information for a variety of purposes, such as in background vetting and fraud detection. ...
|
||
| InLook: Revisiting Email Search Experience | ||
| Pranav Ramarao, Suresh Iyengar, Pushkar Chitnis, Raghavendra Udupa, Balasubramanyan Ashok | ||
| Pages: 1117-1120 | ||
| doi>10.1145/2911451.2911458 | ||
|
Full text: |
||
|
Email remains the most important and widely used mode of online communication despite having its origins in the middle of the last century and being threatened by a variety of online communication innovations. While several studies have predicted ...
|
||
| Interacting with Financial Data using Natural Language | ||
| Vassilis Plachouras, Charese Smiley, Hiroko Bretz, Ola Taylor, Jochen L. Leidner, Dezhao Song, Frank Schilder | ||
| Pages: 1121-1124 | ||
| doi>10.1145/2911451.2911457 | ||
|
Full text: |
||
|
Financial and economic data are typically available in the form of tables and consist mostly of monetary amounts, numeric and other domain-specific fields. They can be very hard to search and they are often made available out of context, or in forms ...
|
||
| LONLIES: Estimating Property Values for Long Tail Entities | ||
| Mina Farid, Ihab F. Ilyas, Steven Euijong Whang, Cong Yu | ||
| Pages: 1125-1128 | ||
| doi>10.1145/2911451.2911466 | ||
|
Full text: |
||
|
Web search engines often retrieve answers for queries about popular entities from a growing knowledge base that is populated by a continuous information extraction process. However, less popular entities are not frequently mentioned on the web and are ...
|
||
| Personalised News and Blog Recommendations based on User Location, Facebook and Twitter User Profiling | ||
| Gabriella Kazai, Iskander Yusof, Daoud Clarke | ||
| Pages: 1129-1132 | ||
| doi>10.1145/2911451.2911464 | ||
|
Full text: |
||
|
This demo presents a prototype mobile app that provides out-of-the-box personalised content recommendations to its users by leveraging and combining the user's location, their Facebook and/or Twitter feed and their in-app actions to automatically infer ...
|
||
| PULP: A System for Exploratory Search of Scientific Literature | ||
| Alan Medlar, Kalle Ilves, Ping Wang, Wray Buntine, Dorota Glowacka | ||
| Pages: 1133-1136 | ||
| doi>10.1145/2911451.2911455 | ||
|
Full text: |
||
|
Despite the growing importance of exploratory search, information retrieval (IR) systems tend to focus on lookup search. Lookup searches are well served by optimising the precision and recall of search results; however, for exploratory search this may ...
|
||
| SECC: A Novel Search Engine Interface with Live Chat Channel | ||
| Cheng Zhang, Peng Zhang, Jingfei Li, Dawei Song | ||
| Pages: 1137-1140 | ||
| doi>10.1145/2911451.2911454 | ||
|
Full text: |
||
|
Traditional information retrieval systems rank documents according to their relevance to users' input queries. State of the art commercial search engines (SEs) train ranking models and suggest query refinements by exploiting collective intelligence implicitly ...
|
||
| Simulating Interactive Information Retrieval: SimIIR: A Framework for the Simulation of Interaction | ||
| David Maxwell, Leif Azzopardi | ||
| Pages: 1141-1144 | ||
| doi>10.1145/2911451.2911469 | ||
|
Full text: |
||
|
Simulation provides a powerful and cost-effective approach to explore and evaluate how interactions between a searcher and system influence search behaviour and performance. With a growing interest in simulation and an increasing number of papers using ...
|
||
| The ComeWithMe System for Searching and Ranking Activity-Based Carpooling Rides | ||
| Vinicius Monteiro de Lira, Chiara Renso, Raffaele Perego, Salvatore Rinzivillo, Valeria Cesario Times | ||
| Pages: 1145-1148 | ||
| doi>10.1145/2911451.2911459 | ||
|
||
|
ComeWithMe is an activity oriented carpooling service that enlarges the candidate destinations of a ride request by considering alternative places where the desired activity can be performed. It is based on the observation that individuals often move ...
|
||
| ThingSeek: A Crawler and Search Engine for the Internet of Things | ||
| Ali Shemshadi, Quan Z. Sheng, Yongrui Qin | ||
| Pages: 1149-1152 | ||
| doi>10.1145/2911451.2911471 | ||
|
||
|
The rapidly growing paradigm of the Internet of Things (IoT) requires new search engines, which can crawl heterogeneous data sources and search in highly dynamic contexts. Existing search engines cannot meet these requirements as they are designed for ...
|
||
| Tweetviz: Visualizing Tweets for Business Intelligence | ||
| Bas Sijtsma, Pernilla Qvarfordt, Francine Chen | ||
| Pages: 1153-1156 | ||
| doi>10.1145/2911451.2911470 | ||
|
||
|
Social media offers potential opportunities for businesses to extract business intelligence. This paper presents Tweetviz, an interactive tool to help businesses extract actionable information from a large set of noisy Twitter messages. Tweetviz visualizes ...
|
||
| Where the Event Lies: Predicting Event Occurrence in Textual Documents | ||
| Andrea Ceroni, Ujwal Gadiraju, Jan Matschke, Simon Wingert, Marco Fisichella | ||
| Pages: 1157-1160 | ||
| doi>10.1145/2911451.2911452 | ||
|
||
|
Manually inspecting text in a document collection to assess whether an event occurs in it is a cumbersome task. Although a manual inspection can allow one to identify and discard false events, it becomes infeasible with increasing numbers of automatically ...
|
||
| SESSION: Doctoral Consortium | ||
| A Novel Approach to Define and Model Contextual Features in Recommender Systems | ||
| Parisa Lak | ||
| Pages: 1161-1161 | ||
| doi>10.1145/2911451.2911481 | ||
|
||
|
Recommender Systems (RS) provide more accurate and more relevant recommendations using contextual feature(s). This accuracy improvement is at the cost of computational expenses. Therefore, finding and selecting the most relevant contextual features is ...
|
||
| A Study of Information Seeking Behavior Using Physical and Online Explorations | ||
| Dongho Choi | ||
| Pages: 1163-1163 | ||
| doi>10.1145/2911451.2911482 | ||
|
||
|
People have their behavioral patterns, through which they determine how to seek and use information. People also exhibit established mobility pattern in their everyday lives. Meanwhile, the modern technologies such as smartphones, wearable devices, and ...
|
||
| Appearance-Based Retrieval of Mathematical Notation in Documents and Lecture Videos | ||
| Kenny Davila | ||
| Pages: 1165-1165 | ||
| doi>10.1145/2911451.2911477 | ||
|
||
|
Large data collections containing millions of math formulae in different formats are available on-line. Retrieving math expressions from these collections is challenging. Based on the notion that visually similar formulas are related, we propose a framework ...
|
||
| Beyond Topical Relevance: Studying Understandability and Reliability in Consumer Health Search | ||
| Joao Palotti | ||
| Pages: 1167-1167 | ||
| doi>10.1145/2911451.2911480 | ||
|
||
|
Nowadays people rely on search engines to explore, understand and manage their health. A recent study from Pew Internet states that one in each three adult American Internet users have used the Internet as a diagnosis tool. Retrieving incorrect or unclear ...
|
||
| Enhancing Information Retrieval with Adapted Word Embedding | ||
| Navid Rekabsaz | ||
| Pages: 1169-1169 | ||
| doi>10.1145/2911451.2911475 | ||
|
||
|
Recent developments on word embedding provide a novel source of information for term-to-term similarity. A recurring question now is whether the provided term associations can be properly integrated in the traditional information retrieval models while ...
|
||
| Fairness in Information Retrieval | ||
| Aldo Lipani | ||
| Pages: 1171-1171 | ||
| doi>10.1145/2911451.2911473 | ||
|
||
|
The offline evaluation of Information Retrieval (IR) systems is performed through the use of test collections. A test collection, in its essence, is composed of: a collection of documents, a set of topics and, a set of relevance assessments for each ...
|
||
| Going Beyond Relevance: Incorporating Effort in Information Retrieval | ||
| Manisha Verma | ||
| Pages: 1173-1173 | ||
| doi>10.1145/2911451.2911487 | ||
|
||
|
Primary focus of Information retrieval (IR) systems has been to optimize for Relevance. Existing approaches used to rank documents or evaluate IR systems do not account for "user effort". At present, relevance captures topical overlap between document ...
|
||
| Measuring Interestingness of Political Documents | ||
| Hosein Azarbonyad | ||
| Pages: 1175-1175 | ||
| doi>10.1145/2911451.2911485 | ||
|
||
|
Political texts are pervasive on the Web covering laws and policies in national and supranational jurisdictions. Access to this data is crucial for government transparency and accountability to the population. The main aim of our research is developing ...
|
||
| Modeling User Feedback in Dynamic Search and Browsing | ||
| Jiyun Luo | ||
| Pages: 1177-1177 | ||
| doi>10.1145/2911451.2911483 | ||
|
||
|
Nowadays searching for complicated information needs becomes more and more common. These complicated needs usually require the users to reform different queries and conduct multiple retrievals in a search session. There are a lot of technologies are ...
|
||
| Modelling User Search Behaviour Based on Process | ||
| Mengdie Zhuang | ||
| Pages: 1179-1179 | ||
| doi>10.1145/2911451.2911486 | ||
|
||
|
Typically, interactive information retrieval (IIR) system evaluations assess search processes and outcomes using a combination of two types of measures: 1. user perception (e.g. users' attitudes of the search experience and outcome); 2. user behaviour ...
|
||
| Retrievability: An Independent Evaluation Measure | ||
| Colin Wilkie | ||
| Pages: 1181-1181 | ||
| doi>10.1145/2911451.2911478 | ||
|
||
|
Information Retrieval systems have traditionally been evaluated in terms of efficiency and performance. These aspects of retrieval systems, whilst very important, do not cover a crucial aspect of the system, the access it provides to the documents of ...
|
||
| Significant Words Representations of Entities | ||
| Mostafa Dehghani | ||
| Pages: 1183-1183 | ||
| doi>10.1145/2911451.2911474 | ||
|
||
|
Transforming the data into a suitable representation is the first key step of data analysis, and the performance of any data oriented method is heavily depending on it. We study questions on how we can best learn representations for textual entities ...
|
||
| Time-Quality Trade-offs in Search | ||
| Ryan Burton | ||
| Pages: 1185-1185 | ||
| doi>10.1145/2911451.2911484 | ||
|
||
|
In this paper, I propose a research agenda surrounding the notion of slow search, where retrieval speed may be traded for improvements in result quality. This time-quality trade-off leads to a number of implications in the areas of human-computer interaction ...
|
||
| Torii: Attribute-based Polarity Analysis with Big Datasets | ||
| Fernando O. Gallego | ||
| Pages: 1187-1187 | ||
| doi>10.1145/2911451.2911479 | ||
|
||
|
Polarity analysis has become a key aspect of market analysis. The number of companies that are interested in the general opinion of the crowd regarding the items that they sell is increasing everyday. Attribute-based polarity analysis is a fine-grained ...
|
||
| User Interaction in Mobile Web Search | ||
| Jaewon Kim | ||
| Pages: 1189-1189 | ||
| doi>10.1145/2911451.2911476 | ||
|
||
|
From previous studies, we believe that search behaviour on touch-enabled mobile devices is different from the behaviour with desktop screens. In the proposed research, we intend to explore user interaction while searching with the aim of improving search ...
|
||
| TUTORIAL SESSION: Tutorials | ||
| Collaborative Information Seeking: Art and Science of Achieving 1+1>2 in IR | ||
| Chirag Shah | ||
| Pages: 1191-1194 | ||
| doi>10.1145/2911451.2914801 | ||
|
||
|
Traditional IR techniques, systems, and methods that assume an individual searcher are often shown to be inadequate for addressing search problems that are multi-faceted and/or too complex or difficult for individuals. The next big leap in information ...
|
||
| Constructing and Mining Web-scale Knowledge Graphs | ||
| Evgeniy Gabrilovich, Nicolas Usunier | ||
| Pages: 1195-1197 | ||
| doi>10.1145/2911451.2914807 | ||
|
||
|
Recent years have witnessed a proliferation of large-scale knowledge graphs, from purely academic projects such as YAGO to major commercial projects such as Google's Knowledge Graph and Microsoft's Satori. Whereas there is a large body of research on ...
|
||
| Counterfactual Evaluation and Learning for Search, Recommendation and Ad Placement | ||
| Thorsten Joachims, Adith Swaminathan | ||
| Pages: 1199-1201 | ||
| doi>10.1145/2911451.2914803 | ||
|
||
|
Online metrics measured through A/B tests have become the gold standard for many evaluation questions. But can we get the same results as A/B tests without actually fielding a new system? And can we train systems to optimize online metrics without subjecting ...
|
||
| Deep Learning for Information Retrieval | ||
| Hang Li, Zhengdong Lu | ||
| Pages: 1203-1206 | ||
| doi>10.1145/2911451.2914800 | ||
|
||
|
Recent years have observed a significant progress in information retrieval and natural language processing with deep learning technologies being successfully applied into almost all of their major tasks. The key to the success of deep learning is its ...
|
||
| From Design to Analysis: Conducting Controlled Laboratory Experiments with Users | ||
| Diane Kelly, Anita Crescenzi | ||
| Pages: 1207-1210 | ||
| doi>10.1145/2911451.2914809 | ||
|
||
|
This full-day tutorial provides general instruction about the design of controlled laboratory experiments that are conducted in order to better understand human information interaction and retrieval. Different data collection methods and procedures are ...
|
||
| Instant Search: A Hands-on Tutorial | ||
| Ganesh Venkataraman, Abhimanyu Lad, Viet Ha-Thuc, Dhruv Arya | ||
| Pages: 1211-1214 | ||
| doi>10.1145/2911451.2914806 | ||
|
||
|
Instant search has become a common part of the search experience in most popular search engines and social networking websites. The goal is to provide instant feedback to the user in terms of query completions ("instant suggestions") or directly provide ...
|
||
| Online Learning to Rank for Information Retrieval: SIGIR 2016 Tutorial | ||
| Artem Grotov, Maarten de Rijke | ||
| Pages: 1215-1218 | ||
| doi>10.1145/2911451.2914798 | ||
|
||
|
During the past 10-15 years offline learning to rank has had a tremendous influence on information retrieval, both scientifically and in practice. Recently, as the limitations of offline learning to rank for information retrieval have become apparent, ...
|
||
| Question Answering with Knowledge Base, Web and Beyond | ||
| Wen-tau Yih, Hao Ma | ||
| Pages: 1219-1221 | ||
| doi>10.1145/2911451.2914804 | ||
|
||
|
In this tutorial, we give the audience a coherent overview of the research of question answering (QA). We first introduce a variety of QA problems proposed by pioneer researchers and briefly describe the early efforts. By contrasting with the current ...
|
||
| Scalability and Efficiency Challenges in Large-Scale Web Search Engines | ||
| B. Barla Cambazoglu, Ricardo Baeza-Yates | ||
| Pages: 1223-1226 | ||
| doi>10.1145/2911451.2914808 | ||
|
||
|
Commercial web search engines need to process thousands of queries every second and provide responses to user queries within a few hundred milliseconds. As a consequence of these tight performance constraints, search engines construct and maintain very ...
|
||
| Simulation of Interaction: A Tutorial on Modelling and Simulating User Interaction and Search Behaviour | ||
| Leif Azzopardi | ||
| Pages: 1227-1230 | ||
| doi>10.1145/2911451.2914799 | ||
|
||
|
Search is an inherently interactive, non-deterministic and user-dependent process. This means that there are many different possible sequences of interactions which could be taken (some ending in success and others ending in failure). Simulation provides ...
|
||
| Succinct Data Structures in Information Retrieval: Theory and Practice | ||
| Simon Gog, Rossano Venturini | ||
| Pages: 1231-1233 | ||
| doi>10.1145/2911451.2914802 | ||
|
||
|
Succinct data structures are used today in many information retrieval applications, e.g., posting lists representation, language model representation, indexing (social) graphs, query auto-completion, document retrieval and indexing dictionary of strings, ...
|
||
| Temporal Information Retrieval | ||
| Nattiya Kanhabua, Avishek Anand | ||
| Pages: 1235-1238 | ||
| doi>10.1145/2911451.2914805 | ||
|
||
|
The study of temporal dynamics and its impact can be framed within the so-called temporal IR approaches, which explain how user behavior, document content and scale vary with time, and how we can use them in our favor in order to improve retrieval effectiveness. ...
|
||
| WORKSHOP SESSION: Workshops | ||
| Third International Workshop on Gamification for Information Retrieval (GamifIR'16) | ||
| Michael Meder, Frank Hopfgartner, Gabriella Kazai, Udo Kruschwitz | ||
| Pages: 1239-1240 | ||
| doi>10.1145/2911451.2917759 | ||
|
||
|
Stronger engagement and greater participation is often crucial to reach a goal or to solve an issue. Issues like the emerging employee engagement crisis, insufficient knowledge sharing, and chronic procrastination. In many cases we need and search for ...
|
||
| HIA'16: The 2nd International Workshop on Heterogeneous Information Access at SIGIR 2016 | ||
| Ke Zhou, Yiqun Liu, Roger Jie Luo, Joemon Jose | ||
| Pages: 1241-1241 | ||
| doi>10.1145/2911451.2917760 | ||
|
||
|
Information access is becoming increasingly heterogeneous. Especially when the user's information need is for exploratory purpose, returning a set of diverse results from different resources could benefit the user. For example, when a user is planning ...
|
||
| Medical Information Search Workshop (MEDIR) | ||
| Steven Bedrick, Lorraine Goeuriot, Gareth J.F. Jones, Anastasia Krithara, Henning Mueller, George Paliouras | ||
| Pages: 1243-1243 | ||
| doi>10.1145/2911451.2917761 | ||
|
||
| Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval | ||
| Nick Craswell, W. Bruce Croft, Jiafeng Guo, Bhaskar Mitra, Maarten de Rijke | ||
| Pages: 1245-1246 | ||
| doi>10.1145/2911451.2917762 | ||
|
||
|
In recent years, deep neural networks have yielded significant performance improvements on speech recognition and computer vision tasks, as well as led to exciting breakthroughs in novel application areas such as automatic voice translation, image captioning, ...
|
||
| Privacy-Preserving IR 2016: Differential Privacy, Search, and Social Media | ||
| Hui Yang, Ian Soboroff, Li Xiong, Charles L.A. Clarke, Simson L. Garfinkel | ||
| Pages: 1247-1248 | ||
| doi>10.1145/2911451.2917763 | ||
|
||
|
Due to lack of mature techniques in privacy-preserving information retrieval (IR), concerns about information privacy and security have become serious obstacles that prevent valuable user data to be used in IR research such as studies on query logs, ...
|
||
| Search as Learning (SAL) Workshop 2016 | ||
| Jacek Gwizdka, Preben Hansen, Claudia Hauff, Jiyin He, Noriko Kando | ||
| Pages: 1249-1250 | ||
| doi>10.1145/2911451.2917766 | ||
|
||
|
The "Search as Learning" (SAL) workshop is focused on an area within the information retrieval field that is only beginning to emerge: supporting users in their learning whilst interacting with information content.
|
||
| SIGIR 2016 Workshop WebQA II: Web Question Answering Beyond Factoids | ||
| Alessandro Moschitti, Lluís Màrquez, Preslav Nakov, Eugene Agichtein, Charles Clarke, Idan Szpektor | ||
| Pages: 1251-1252 | ||
| doi>10.1145/2911451.2917767 | ||
|
||
|
Web search engines have made great progress at answering factoid queries. However, they are not well-tailored for managing more complex questions, especially when they require explanation and/or description. The WebQA workshop series aims at exploring ...
|
||
Welcome to SIGIR 2016, the 39th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, the premier conference in the area. We are grateful to all those who chose to submit their research papers to SIGIR 2016 and gave the Program Committee an opportunity to evaluate their work for potential inclusion in the program. We are also grateful to the 61 Senior Program Committee Members, 244 Program Committee Members, and many additional reviewers for providing significant time and effort to selecting this year's conference program. This pool of committed SIGIR volunteers comes from 24 countries and over 180 institutions.
The PC reviewed 341 papers for the full paper track and accepted 62, an acceptance rate of 18%. The top countries in terms of accepted papers (taking all author affiliations of each paper equally into account) were the U.S.A. (34%) and China (23%), while the most successful countries (as measured by how likely authors from that country were to have their papers accepted) were Russia (55% of authors had their paper accepted), Israel (50%), and Germany (45%). Of the 1236 authors overall, 313 were from the U.S.A., 283 from China, and 289 from the next six most represented countries combined.
As has been customary for many years, SIGIR 2016 employed a two-tier double-blind review process. At least three reviewers reviewed each paper, and then the Primary Area Chair of the paper led a discussion, which was used as the basis of the meta-review. The Secondary Area Chair assigned to each paper double-checked the reviews and discussion and, in some cases, provided an additional review. The PC chairs carefully examined the reviews and associated discussion, asking for additional reviews where necessary to provide a fuller discussion or additional expert opinion.
The SIGIR 2016 PC meeting operated as a virtual meeting, held over several days, in order to facilitate the involvement of as many Senior PC members as possible. Prior to the PC meeting, the PC chairs proposed a list of "clear accepts" and "clear rejects" where the meta-reviews and reviews clearly indicated a decision. Over the four days of the PC meeting, the chairs indicated which undecided papers were open for discussion, led the discussion on these papers, and made the final decision when discussions moved towards completion; overall, nearly half of the submitted papers were reviewed by the chairs or at the meeting. We are grateful for the involvement of the Senior PC members in supporting this PC meeting, especially those very committed members who participated in almost every discussion. The virtual PC meeting was supplemented by a small physical meeting of the Senior PC members who attended the ECIR 2016 conference, which took place during the same week.
The short paper track received 339 submissions and accepted 104 papers for what promises to be a strong short paper track. Thanks are due to Ben Carterette, Carlos Castillo, and Jaana Kekäläinen, the track chairs, for arranging and managing a thorough review process for so many papers -- a huge effort for which we are very grateful.
The tutorial track reviewed 20 submissions and accepted 2 full-day tutorials and 10 half-day tutorials. We are grateful to Evangelos Kanoulas and Tie-Yan Liu for leading the selection of a diverse and high quality tutorial program.
The workshops track received 12 submissions and selected 7 workshop proposals. We thank Aristides Gionis and Edie Rasmussen for their efforts in producing a great set of workshops.
The demos track received 35 submissions of which 21 were accepted. Many thanks are due to Craig Macdonald for chairing this track and providing us with an array of interesting demos.
Brian D. Davison and Kira Radinsky organised the Doctoral Consortium, a vitally important part of the SIGIR program. Thanks to their efforts, the event received 23 submissions from doctoral students and accepted 15 of these for presentation and discussion at the consortium.
This year's industry track is the SIGIR Symposium on IR in Practice (SIRIP 2016) co-chaired by Jussi Karlgren and Gilad Mishne. Their efforts in organizing SIRIP and producing an interesting set of speakers from diverse backgrounds are gratefully acknowledged.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
| SESSION: Salton Award | ||
| Charlie Clarke | ||
| Salton Award Lecture: People, Interacting with Information | ||
| Nicholas J. Belkin | ||
| Pages: 1-2 | ||
| doi>10.1145/2766462.2767854 | ||
|
||
|
Colleagues, friends, let me begin by expressing how pleased, and humbly honored I am to be a recipient of the Gerard Salton Award. Gerry was a great man, and to receive the award named for him is very special. For me personally, it is especially meaningful, ...
|
||
| SESSION: Session 1A: Assisting the Search | ||
| Ellen Voorhees | ||
| Exploring Session Context using Distributed Representations of Queries and Reformulations | ||
| Bhaskar Mitra | ||
| Pages: 3-12 | ||
| doi>10.1145/2766462.2767702 | ||
|
||
|
Search logs contain examples of frequently occurring patterns of user reformulations of queries. Intuitively, the reformulation "San Francisco" -- "San Francisco 49ers" is semantically similar to "Detroit" -- "Detroit Lions". Likewise, "London" -- "things ...
|
||
| An Eye-Tracking Study of Query Reformulation | ||
| Carsten Eickhoff, Sebastian Dungs, Vu Tran | ||
| Pages: 13-22 | ||
| doi>10.1145/2766462.2767703 | ||
|
||
|
Information about a user's domain knowledge and interest can be important signals for many information retrieval tasks such as query suggestion or result ranking. State-of-the-art user models rely on coarse-grained representations of the user's previous ...
|
||
| Differences in the Use of Search Assistance for Tasks of Varying Complexity | ||
| Robert Capra, Jaime Arguello, Anita Crescenzi, Emily Vardell | ||
| Pages: 23-32 | ||
| doi>10.1145/2766462.2767741 | ||
|
||
|
In this paper, we study how users interact with a search assistance tool while completing tasks of varying complexity. We designed a novel tool referred to as the search guide (SG) that displays the search trails (queries issued, results clicked, pages ...
|
||
| SESSION: Session 1B: Multimedia | ||
| Doug Oard | ||
| Dynamic Query Modeling for Related Content Finding | ||
| Daan Odijk, Edgar Meij, Isaac Sijaranamual, Maarten de Rijke | ||
| Pages: 33-42 | ||
| doi>10.1145/2766462.2767715 | ||
|
||
|
While watching television, people increasingly consume additional content related to what they are watching. We consider the task of finding video content related to a live television broadcast for which we leverage the textual stream of subtitles associated ...
|
||
| Image-Based Recommendations on Styles and Substitutes | ||
| Julian McAuley, Christopher Targett, Qinfeng Shi, Anton van den Hengel | ||
| Pages: 43-52 | ||
| doi>10.1145/2766462.2767755 | ||
|
||
|
Humans inevitably develop a sense of the relationships between objects, some of which are based on their appearance. Some pairs of objects might be seen as being alternatives to each other (such as two pairs of jeans), while others may be seen as being ...
|
||
| Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search | ||
| Yingwei Pan, Ting Yao, Houqiang Li, Chong-Wah Ngo, Tao Mei | ||
| Pages: 53-62 | ||
| doi>10.1145/2766462.2767725 | ||
|
||
|
Similarity search is one of the fundamental problems for large scale multimedia applications. Hashing techniques, as one popular strategy, have been intensively investigated owing to the speed and memory efficiency. Recent research has shown that leveraging ...
|
||
| SESSION: Session 1C: Efficient Algorithms | ||
| Andrew Trotman | ||
| Optimal Aggregation Policy for Reducing Tail Latency of Web Search | ||
| Jeong-Min Yun, Yuxiong He, Sameh Elnikety, Shaolei Ren | ||
| Pages: 63-72 | ||
| doi>10.1145/2766462.2767708 | ||
|
||
|
A web search engine often employs partition-aggregate architecture, where an aggregator propagates a user query to all index serving nodes (ISNs) and collects the responses from them. An aggregation policy determines how long the aggregators wait for ...
|
||
| QuickScorer: A Fast Algorithm to Rank Documents with Additive Ensembles of Regression Trees | ||
| Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini | ||
| Pages: 73-82 | ||
| doi>10.1145/2766462.2767733 | ||
|
||
|
Learning-to-Rank models based on additive ensembles of regression trees have proven to be very effective for ranking query results returned by Web search engines, a scenario where quality and efficiency requirements are very demanding. Unfortunately, ...
|
||
| High Quality Graph-Based Similarity Search | ||
| Weiren Yu, Julie Ann McCann | ||
| Pages: 83-92 | ||
| doi>10.1145/2766462.2767720 | ||
|
||
|
SimRank is an influential link-based similarity measure that has been used in many fields of Web search and sociometry. The best-of-breed method by Kusumoto et al., however, does not always deliver high-quality results, since it fails to accurately ...
|
||
| SESSION: Session 2A: Diversity and Bias | ||
| Gareth Jones | ||
| Summarizing Contrastive Themes via Hierarchical Non-Parametric Processes | ||
| Zhaochun Ren, Maarten de Rijke | ||
| Pages: 93-102 | ||
| doi>10.1145/2766462.2767713 | ||
|
||
|
Given a topic of interest, a contrastive theme is a group of opposing pairs of viewpoints. We address the task of summarizing contrastive themes: given a set of opinionated documents, select meaningful sentences to represent contrastive themes present ...
|
||
| Splitting Water: Precision and Anti-Precision to Reduce Pool Bias | ||
| Aldo Lipani, Mihai Lupu, Allan Hanbury | ||
| Pages: 103-112 | ||
| doi>10.1145/2766462.2767749 | ||
|
||
|
For many tasks in evaluation campaigns, especially those modeling narrow domain-specific challenges, lack of participation leads to a potential pooling bias due to the scarce number of pooled runs. It is well known that the reliability of a test collection ...
|
||
| Learning Maximal Marginal Relevance Model via Directly Optimizing Diversity Evaluation Measures | ||
| Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng | ||
| Pages: 113-122 | ||
| doi>10.1145/2766462.2767710 | ||
|
||
|
In this paper we address the issue of learning a ranking model for search result diversification. In the task, a model concerns with both query-document relevance and document diversity is automatically created with training data. Ideally a diverse ranking ...
|
||
| SESSION: Session 2B: Queries | ||
| Milad Shokouhi | ||
| Analyzing User's Sequential Behavior in Query Auto-Completion via Markov Processes | ||
| Liangda Li, Hongbo Deng, Anlei Dong, Yi Chang, Hongyuan Zha, Ricardo Baeza-Yates | ||
| Pages: 123-132 | ||
| doi>10.1145/2766462.2767723 | ||
|
||
|
Query auto-completion (QAC) plays an important role in assisting users typing less while submitting a query. The QAC engine generally offers a list of suggested queries that start with a user's input as a prefix, and the list of suggestions is changed ...
|
||
| Learning by Example: Training Users with High-quality Query Suggestions | ||
| Morgan Harvey, Claudia Hauff, David Elsweiler | ||
| Pages: 133-142 | ||
| doi>10.1145/2766462.2767731 | ||
|
||
|
The queries submitted by users to search engines often poorly describe their information needs and represent a potential bottleneck in the system. In this paper we investigate to what extent it is possible to aid users in learning how to formulate better ...
|
||
| adaQAC: Adaptive Query Auto-Completion via Implicit Negative Feedback | ||
| Aston Zhang, Amit Goyal, Weize Kong, Hongbo Deng, Anlei Dong, Yi Chang, Carl A. Gunter, Jiawei Han | ||
| Pages: 143-152 | ||
| doi>10.1145/2766462.2767697 | ||
|
||
|
Query auto-completion (QAC) facilitates user query composition by suggesting queries given query prefix inputs. In 2014, global users of Yahoo! Search saved more than 50% keystrokes when submitting English queries by selecting suggestions of QAC. Users' ...
|
||
| SESSION: Session 2C: Graphs | ||
| Jaap Kamps | ||
| A Random Walk Model for Optimization of Search Impact in Web Frontier Ranking | ||
| Giang Tran, Ata Turk, B. Barla Cambazoglu, Wolfgang Nejdl | ||
| Pages: 153-162 | ||
| doi>10.1145/2766462.2767737 | ||
|
||
|
Large-scale web search engines need to crawl the Web continuously to discover and download newly created web content. The speed at which the new content is discovered and the quality of the discovered content can have a big impact on the coverage and ...
|
||
| A Similarity Measure for Weaving Patterns in Textiles | ||
| Sven Helmer, Vuong Minh Ngo | ||
| Pages: 163-172 | ||
| doi>10.1145/2766462.2767735 | ||
|
||
|
We propose a novel approach for measuring the similarity between weaving patterns that can provide similarity-based search functionality for textile archives. We represent textile structures using hypergraphs and extract multisets of k-neighborhoods ...
|
||
| Local Ranking Problem on the BrowseGraph | ||
| Michele Trevisiol, Luca Maria Aiello, Paolo Boldi, Roi Blanco | ||
| Pages: 173-182 | ||
| doi>10.1145/2766462.2767704 | ||
|
||
|
The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink ...
|
||
| SESSION: Session 3A: Search Experience | ||
| Birger Larsen | ||
| How many results per page?: A Study of SERP Size, Search Behavior and User Experience | ||
| Diane Kelly, Leif Azzopardi | ||
| Pages: 183-192 | ||
| doi>10.1145/2766462.2767732 | ||
|
||
|
The provision of "ten blue links" has emerged as the standard for the design of search engine result pages (SERPs). While numerous aspects of SERPs have been examined, little attention has been paid to the number of results displayed per page. This paper ...
|
||
| Influence of Vertical Result in Web Search Examination | ||
| Zeyang Liu, Yiqun Liu, Ke Zhou, Min Zhang, Shaoping Ma | ||
| Pages: 193-202 | ||
| doi>10.1145/2766462.2767714 | ||
|
||
|
Research in how users examine results on search engine result pages (SERPs) helps improve result ranking, advertisement placement, performance evaluation and search UI design. Although examination behavior on organic search results (also known as "ten ...
|
||
| Unconscious Physiological Effects of Search Latency on Users and Their Click Behaviour | ||
| Miguel Barreda-Ángeles, Ioannis Arapakis, Xiao Bai, B. Barla Cambazoglu, Alexandre Pereda-Baños | ||
| Pages: 203-212 | ||
| doi>10.1145/2766462.2767719 | ||
|
||
|
Understanding the impact of a search system's response latency on its users' searching behaviour has been recently an active research topic in the information retrieval and human-computer interaction areas. Along the same line, this paper focuses on ...
|
||
| SESSION: Session 3B: Social Media | ||
| Claudia Hauff | ||
| Multiple Social Network Learning and Its Application in Volunteerism Tendency Prediction | ||
| Xuemeng Song, Liqiang Nie, Luming Zhang, Mohammad Akbari, Tat-Seng Chua | ||
| Pages: 213-222 | ||
| doi>10.1145/2766462.2767726 | ||
|
||
|
We are living in the era of social networks, where people throughout the world are connected and organized by multiple social networks. The views revealed by different social networks may vary according to the different services they offer. They are ...
|
||
| HSpam14: A Collection of 14 Million Tweets for Hashtag-Oriented Spam Research | ||
| Surendra Sedhai, Aixin Sun | ||
| Pages: 223-232 | ||
| doi>10.1145/2766462.2767701 | ||
|
||
|
Hashtag facilitates information diffusion in Twitter by creating dynamic and virtual communities for information aggregation from all Twitter users. Because hashtags serve as additional channels for one's tweets to be potentially accessed by other users ...
|
||
| Uncovering Crowdsourced Manipulation of Online Reviews | ||
| Amir Fayazi, Kyumin Lee, James Caverlee, Anna Squicciarini | ||
| Pages: 233-242 | ||
| doi>10.1145/2766462.2767742 | ||
|
||
|
Online reviews are a cornerstone of consumer decision making. However, their authenticity and quality has proven hard to control, especially as polluters target these reviews toward promoting products or in degrading competitors. In a troubling direction, ...
|
||
| SESSION: Session 3C: Entities | ||
| Krisztian Balog | ||
| Relevance Scores for Triples from Type-Like Relations | ||
| Hannah Bast, Björn Buchhold, Elmar Haussmann | ||
| Pages: 243-252 | ||
| doi>10.1145/2766462.2767734 | ||
|
||
|
We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity "belongs" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, ...
|
||
| Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of Data | ||
| Nikita Zhiltsov, Alexander Kotov, Fedor Nikolaev | ||
| Pages: 253-262 | ||
| doi>10.1145/2766462.2767756 | ||
|
||
|
Previously proposed approaches to ad-hoc entity retrieval in the Web of Data (ERWD) used multi-fielded representation of entities and relied on standard unigram bag-of-words retrieval models. Although retrieval models incorporating term dependencies ...
|
||
| Mining, Ranking and Recommending Entity Aspects | ||
| Ridho Reinanda, Edgar Meij, Maarten de Rijke | ||
| Pages: 263-272 | ||
| doi>10.1145/2766462.2767724 | ||
|
||
|
Entity queries constitute a large fraction of web search queries and most of these queries are in the form of an entity mention plus some context terms that represent an intent in the context of that entity. We refer to these entity-oriented search intents ...
|
||
| SESSION: Session 4A: User Models | ||
| Diane Kelly | ||
| Bayesian Ranker Comparison Based on Historical User Interactions | ||
| Artem Grotov, Shimon Whiteson, Maarten de Rijke | ||
| Pages: 273-282 | ||
| doi>10.1145/2766462.2767730 | ||
|
||
|
We address the problem of how to safely compare rankers for information retrieval. In particular, we consider how to control the risks associated with switching from an existing production ranker to a new candidate ranker. Whereas existing online comparison ...
|
||
| Incorporating Non-sequential Behavior into Click Models | ||
| Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-yun Nie, Shaoping Ma | ||
| Pages: 283-292 | ||
| doi>10.1145/2766462.2767712 | ||
|
||
|
Click-through information is considered as a valuable source of users' implicit relevance feedback. As user behavior is usually influenced by a number of factors such as position, presentation style and site reputation, researchers have proposed a variety ...
|
||
| Untangling Result List Refinement and Ranking Quality: a Framework for Evaluation and Prediction | ||
| Jiyin He, Marc Bron, Arjen de Vries, Leif Azzopardi, Maarten de Rijke | ||
| Pages: 293-302 | ||
| doi>10.1145/2766462.2767740 | ||
|
||
|
Traditional batch evaluation metrics assume that user interaction with search results is limited to scanning down a ranked list. However, modern search interfaces come with additional elements supporting result list refinement (RLR) through facets and ...
|
||
| SESSION: Session 4B: Recommending | ||
| Paul Bennett | ||
| WEMAREC: Accurate and Scalable Recommendation through Weighted and Ensemble Matrix Approximation | ||
| Chao Chen, Dongsheng Li, Yingying Zhao, Qin Lv, Li Shang | ||
| Pages: 303-312 | ||
| doi>10.1145/2766462.2767718 | ||
|
||
|
Matrix approximation is one of the most effective methods for collaborative filtering-based recommender systems. However, the high computation complexity of matrix factorization on large datasets limits its scalability. Prior solutions have adopted co-clustering ...
|
||
| Effective Latent Models for Binary Feedback in Recommender Systems | ||
| Maksims Volkovs, Guang Wei Yu | ||
| Pages: 313-322 | ||
| doi>10.1145/2766462.2767716 | ||
|
||
|
In many collaborative filtering (CF) applications, latent approaches are the preferred model choice due to their ability to generate real-time recommendations efficiently. However, the majority of existing latent models are not designed for implicit ...
|
||
| Personalized Recommendation via Parameter-Free Contextual Bandits | ||
| Liang Tang, Yexi Jiang, Lei Li, Chunqiu Zeng, Tao Li | ||
| Pages: 323-332 | ||
| doi>10.1145/2766462.2767707 | ||
|
||
|
Personalized recommendation services have gained increasing popularity and attention in recent years as most useful information can be accessed online in real-time. Most online recommender systems try to address the information needs of users by virtue ...
|
||
| SESSION: Session 4C: Classifying & Ranking | ||
| Yi Chang | ||
| An Efficient and Scalable MetaFeature-based Document Classification Approach based on Massively Parallel Computing | ||
| Sérgio Canuto, Marcos Gonçalves, Wisllay Santos, Thierson Rosa, Wellington Martins | ||
| Pages: 333-342 | ||
| doi>10.1145/2766462.2767743 | ||
|
||
|
The unprecedented growth of available data nowadays has stimulated the development of new methods for organizing and extracting useful knowledge from this immense amount of data. Automatic Document Classification (ADC) is one of such methods, that uses ...
|
||
| Listwise Collaborative Filtering | ||
| Shanshan Huang, Shuaiqiang Wang, Tie-Yan Liu, Jun Ma, Zhumin Chen, Jari Veijalainen | ||
| Pages: 343-352 | ||
| doi>10.1145/2766462.2767693 | ||
|
||
|
Recently, ranking-oriented collaborative filtering (CF) algorithms have achieved great success in recommender systems. They obtained state-of-the-art performances by estimating a preference ranking of items for each user rather than estimating the absolute ...
|
||
| BROOF: Exploiting Out-of-Bag Errors, Boosting and Random Forests for Effective Automated Classification | ||
| Thiago Salles, Marcos Gonçalves, Victor Rodrigues, Leonardo Rocha | ||
| Pages: 353-362 | ||
| doi>10.1145/2766462.2767747 | ||
|
||
|
Random Forests (RF) and Boosting are two of the most successful supervised learning paradigms for automatic classification. In this work we propose to combine both strategies in order to exploit their strengths while simultaneously solving some of their ...
|
||
| SESSION: Session 5A: Deep Learning | ||
| Berthier Ribeiro-Neto | ||
| Monolingual and Cross-Lingual Information Retrieval Models Based on (Bilingual) Word Embeddings | ||
| Ivan Vulić, Marie-Francine Moens | ||
| Pages: 363-372 | ||
| doi>10.1145/2766462.2767752 | ||
|
||
|
We propose a new unified framework for monolingual (MoIR) and cross-lingual information retrieval (CLIR) which relies on the induction of dense real-valued word vectors known as word embeddings (WE) from comparable data. To this end, we make several ...
|
||
| Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks | ||
| Aliaksei Severyn, Alessandro Moschitti | ||
| Pages: 373-382 | ||
| doi>10.1145/2766462.2767738 | ||
|
||
|
Learning a similarity function between pairs of objects is at the core of learning to rank approaches. In information retrieval tasks we typically deal with query-document pairs, in question answering -- question-answer pairs. However, before learning ...
|
||
| Context- and Content-aware Embeddings for Query Rewriting in Sponsored Search | ||
| Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Fabrizio Silvestri, Narayan Bhamidipati | ||
| Pages: 383-392 | ||
| doi>10.1145/2766462.2767709 | ||
|
||
|
Search engines represent one of the most popular web services, visited by more than 85% of internet users on a daily basis. Advertisers are interested in making use of this vast business potential, as very clear intent signal communicated through the ...
|
||
| SESSION: Session 5B: Products | ||
| Grace Hui Yang | ||
| Retrieval of Relevant Opinion Sentences for New Products | ||
| Dae Hoon Park, Hyun Duk Kim, ChengXiang Zhai, Lifan Guo | ||
| Pages: 393-402 | ||
| doi>10.1145/2766462.2767748 | ||
|
||
|
With the rapid development of Internet and E-commerce, abundant product reviews have been written by consumers who bought the products. These reviews are very useful for consumers to optimize their purchasing decisions. However, since the reviews are ...
|
||
| Learning Hierarchical Representation Model for Next Basket Recommendation | ||
| Pengfei Wang, Jiafeng Guo, Yanyan Lan, Jun Xu, Shengxian Wan, Xueqi Cheng | ||
| Pages: 403-412 | ||
| doi>10.1145/2766462.2767694 | ||
|
||
|
Next basket recommendation is a crucial task in market basket analysis. Given a user's purchase history, usually a sequence of transaction data, one attempts to build a recommender that can predict the next few items that the user most probably would ...
|
||
| Parametric and Non-parametric User-aware Sentiment Topic Models | ||
| Zaihan Yang, Alexander Kotov, Aravind Mohan, Shiyong Lu | ||
| Pages: 413-422 | ||
| doi>10.1145/2766462.2767758 | ||
|
||
|
The popularity of Web 2.0 has resulted in a large number of publicly available online consumer reviews created by a demographically diverse user base. Information about the authors of these reviews, such as age, gender and location, provided by many ...
|
||
| SESSION: Session 5C: Locations | ||
| Craig Macdonald | ||
| Learning to Extract Local Events from the Web | ||
| John Foley, Michael Bendersky, Vanja Josifovski | ||
| Pages: 423-432 | ||
| doi>10.1145/2766462.2767739 | ||
|
||
|
The goal of this work is extraction and retrieval of local events from web pages. Examples of local events include small venue concerts, theater performances, garage sales, movie screenings, etc. We collect these events in the form of retrievable calendar ...
|
||
| Rank-GeoFM: A Ranking based Geographical Factorization Method for Point of Interest Recommendation | ||
| Xutao Li, Gao Cong, Xiao-Li Li, Tuan-Anh Nguyen Pham, Shonali Krishnaswamy | ||
| Pages: 433-442 | ||
| doi>10.1145/2766462.2767722 | ||
|
||
|
With the rapid growth of location-based social networks, Point of Interest (POI) recommendation has become an important research problem. However, the scarcity of the check-in data, a type of implicit feedback data, poses a severe challenge for existing ...
|
||
| GeoSoCa: Exploiting Geographical, Social and Categorical Correlations for Point-of-Interest Recommendations | ||
| Jia-Dong Zhang, Chi-Yin Chow | ||
| Pages: 443-452 | ||
| doi>10.1145/2766462.2767711 | ||
|
||
|
Recommending users with their preferred points-of-interest (POIs), e.g., museums and restaurants, has become an important feature for location-based social networks (LBSNs), which benefits people to explore new places and businesses to discover potential ...
|
||
| SESSION: Session 6A: Experiment Design | ||
| Alistair Moffat | ||
| Optimised Scheduling of Online Experiments | ||
| Eugene Kharitonov, Craig Macdonald, Pavel Serdyukov, Iadh Ounis | ||
| Pages: 453-462 | ||
| doi>10.1145/2766462.2767706 | ||
|
||
|
Modern search engines increasingly rely on online evaluation methods such as A/B tests and interleaving. These online evaluation methods make use of interactions by the search engine's users to test various changes in the search engine. However, since ...
|
||
| Predicting Search Satisfaction Metrics with Interleaved Comparisons | ||
| Anne Schuth, Katja Hofmann, Filip Radlinski | ||
| Pages: 463-472 | ||
| doi>10.1145/2766462.2767695 | ||
|
||
|
The gold standard for online retrieval evaluation is AB testing. Rooted in the idea of a controlled experiment, AB tests compare the performance of an experimental system (treatment) on one sample of the user population, to that of a baseline system ...
|
||
| Sequential Testing for Early Stopping of Online Experiments | ||
| Eugene Kharitonov, Aleksandr Vorobev, Craig Macdonald, Pavel Serdyukov, Iadh Ounis | ||
| Pages: 473-482 | ||
| doi>10.1145/2766462.2767729 | ||
|
||
|
Online evaluation methods, such as A/B and interleaving experiments, are widely used for search engine evaluation. Since they rely on noisy implicit user feedback, running each experiment takes a considerable time. Recently, the problem of reducing the ...
|
||
| SESSION: Session 6B: Predicting | ||
| Djoerd Hiemstra | ||
| Inferring Searcher Attention by Jointly Modeling User Interactions and Content Salience | ||
| Dmitry Lagun, Eugene Agichtein | ||
| Pages: 483-492 | ||
| doi>10.1145/2766462.2767745 | ||
|
||
|
Modeling and predicting user attention is crucial for interpreting search behavior. The numerous applications include quantifying web search satisfaction, estimating search quality, and measuring and predicting online user engagement. While prior research ...
|
||
| Different Users, Different Opinions: Predicting Search Satisfaction with Mouse Movement Information | ||
| Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, Xuan Zhu | ||
| Pages: 493-502 | ||
| doi>10.1145/2766462.2767721 | ||
|
||
|
Satisfaction prediction is one of the prime concerns in search performance evaluation. It is a non-trivial task for two major reasons: (1) The definition of satisfaction is rather subjective and different users may have different opinions in satisfaction ...
|
||
| Predicting Search Intent Based on Pre-Search Context | ||
| Weize Kong, Rui Li, Jie Luo, Aston Zhang, Yi Chang, James Allan | ||
| Pages: 503-512 | ||
| doi>10.1145/2766462.2767757 | ||
|
||
|
While many studies have been conducted on query understanding, there is limited understanding on why users start searches and how to predict search intent. In this paper, we propose to study this important but less explored problem. Our key intuition ...
|
||
| SESSION: Session 6C: Tasks and Devices | ||
| Emine Yilmaz | ||
| Leveraging Procedural Knowledge for Task-oriented Search | ||
| Zi Yang, Eric Nyberg | ||
| Pages: 513-522 | ||
| doi>10.1145/2766462.2767744 | ||
|
||
|
Many search engine users attempt to satisfy an information need by issuing multiple queries, with the expectation that each result will contribute some portion of the required information. Previous research has shown that structured or semi-structured ...
|
||
| Personalizing Search on Shared Devices | ||
| Ryen W. White, Ahmed Hassan Awadallah | ||
| Pages: 523-532 | ||
| doi>10.1145/2766462.2767736 | ||
|
||
|
Search personalization tailors the search experience to individual searchers. To do this, search engines construct interest models comprising signals from observed behavior associated with machines, often via Web browser cookies or other user identifiers. ...
|
||
| Leveraging User Reviews to Improve Accuracy for Mobile App Retrieval | ||
| Dae Hoon Park, Mengwen Liu, ChengXiang Zhai, Haohong Wang | ||
| Pages: 533-542 | ||
| doi>10.1145/2766462.2767759 | ||
|
||
|
Smartphones and tablets with their apps pervaded our everyday life, leading to a new demand for search tools to help users find the right apps to satisfy their immediate needs. While there are a few commercial mobile app search engines available, the ...
|
||
| SESSION: Keynote | ||
| Ricardo Baeza-Yates | ||
| Towards a Game-Theoretic Framework for Information Retrieval | ||
| ChengXiang Zhai | ||
| Pages: 543-543 | ||
| doi>10.1145/2766462.2767853 | ||
|
||
|
The task of information retrieval (IR) has traditionally been defined as to rank a collection of documents in response to a query. While this definition has enabled most research progress in IR so far, it does not model accurately the actual retrieval ...
|
||
| SESSION: Session 7A: Assessing | ||
| Justin Zobel | ||
| Representative & Informative Query Selection for Learning to Rank using Submodular Functions | ||
| Rishabh Mehrotra, Emine Yilmaz | ||
| Pages: 545-554 | ||
| doi>10.1145/2766462.2767753 | ||
|
||
|
The performance of Learning to Rank algorithms strongly depend on the number of labelled queries in the training set, while the cost incurred in annotating a large number of queries with relevance judgements is prohibitively high. As a result, constructing ...
|
||
| Impact of Surrogate Assessments on High-Recall Retrieval | ||
| Adam Roegiest, Gordon V. Cormack, Charles L.A. Clarke, Maura R. Grossman | ||
| Pages: 555-564 | ||
| doi>10.1145/2766462.2767754 | ||
|
||
|
We are concerned with the effect of using a surrogate assessor to train a passive (i.e., batch) supervised-learning method to rank documents for subsequent review, where the effectiveness of the ranking will be evaluated using a different assessor deemed ...
|
||
| The Benefits of Magnitude Estimation Relevance Assessments for Information Retrieval Evaluation | ||
| Andrew Turpin, Falk Scholer, Stefano Mizzaro, Eddy Maddalena | ||
| Pages: 565-574 | ||
| doi>10.1145/2766462.2767760 | ||
|
||
|
Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where observers assign numbers to stimuli in response to their perceived intensity. We investigate the use of magnitude estimation for judging the relevance ...
|
||
| SESSION: Session 7B: Terms | ||
| Arjen de Vries | ||
| Learning to Reweight Terms with Distributed Representations | ||
| Guoqing Zheng, Jamie Callan | ||
| Pages: 575-584 | ||
| doi>10.1145/2766462.2767700 | ||
|
||
|
Term weighting is a fundamental problem in IR research and numerous weighting models have been proposed. Proper term weighting can greatly improve retrieval accuracies, which essentially involves two types of query understanding: interpreting the query ...
|
||
| A Probabilistic Model for Information Retrieval Based on Maximum Value Distribution | ||
| Jiaul H. Paik | ||
| Pages: 585-594 | ||
| doi>10.1145/2766462.2767762 | ||
|
||
|
The main goal of a retrieval model is to measure the degree of relevance of a document with respect to the given query. Probabilistic models are widely used to measure the likelihood of relevance of a document by combining within document term frequency ...
|
||
| Non-Compositional Term Dependence for Information Retrieval | ||
| Christina Lioma, Jakob Grue Simonsen, Birger Larsen, Niels Dalum Hansen | ||
| Pages: 595-604 | ||
| doi>10.1145/2766462.2767717 | ||
|
||
|
Modelling term dependence in IR aims to identify co-occurring terms that are too heavily dependent on each other to be treated as a bag of words, and to adapt the indexing and ranking accordingly. Dependent terms are predominantly identified using lexical ...
|
||
| SESSION: Session 8A: Variability in test collections | ||
| Mark Smucker | ||
| On the Relation Between Assessor's Agreement and Accuracy in Gamified Relevance Assessment | ||
| Olga Megorskaya, Vladimir Kukushkin, Pavel Serdyukov | ||
| Pages: 605-614 | ||
| doi>10.1145/2766462.2767727 | ||
|
||
|
Expert judgments (labels) are widely used in Information Retrieval for the purposes of search quality evaluation and machine learning. Setting up the process of collecting such judgments is a challenge of its own, and the maintenance of judgments quality ...
|
||
| Assessor Differences and User Preferences in Tweet Timeline Generation | ||
| Yulu Wang, Garrick Sherman, Jimmy Lin, Miles Efron | ||
| Pages: 615-624 | ||
| doi>10.1145/2766462.2767699 | ||
|
||
|
In information retrieval evaluation, when presented with an effectiveness difference between two systems, there are three relevant questions one might ask. First, are the differences statistically significant? Second, is the comparison stable with respect ...
|
||
| User Variability and IR System Evaluation | ||
| Peter Bailey, Alistair Moffat, Falk Scholer, Paul Thomas | ||
| Pages: 625-634 | ||
| doi>10.1145/2766462.2767728 | ||
|
||
|
Test collection design eliminates sources of user variability to make statistical comparisons among information retrieval (IR) systems more affordable. Does this choice unnecessarily limit generalizability of the outcomes to real usage scenarios? We ...
|
||
| SESSION: Session 8B: Citations | ||
| Mark Sanderson | ||
| An Entity Class-Dependent Discriminative Mixture Model for Cumulative Citation Recommendation | ||
| Jingang Wang, Dandan Song, Qifan Wang, Zhiwei Zhang, Luo Si, Lejian Liao, Chin-Yew Lin | ||
| Pages: 635-644 | ||
| doi>10.1145/2766462.2767698 | ||
|
||
|
This paper studies Cumulative Citation Recommendation (CCR) for Knowledge Base Acceleration (KBA). The CCR task aims to detect potential citations of a set of target entities with priorities from a volume of temporally-ordered stream corpus. Previous ...
|
||
| Scientific Information Understanding via Open Educational Resources (OER) | ||
| Xiaozhong Liu, Zhuoren Jiang, Liangcai Gao | ||
| Pages: 645-654 | ||
| doi>10.1145/2766462.2767750 | ||
|
||
|
Scientific publication retrieval/recommendation has been investigated in the past decade. However, to the best of our knowledge, few efforts have been made to help junior scholars and graduate students to understand and consume the essence of those scientific ...
|
||
| In Situ Insights | ||
| Yuanhua Lv, Ariel Fuxman | ||
| Pages: 655-664 | ||
| doi>10.1145/2766462.2767696 | ||
|
||
|
When consuming content in applications such as e-readers, word processors, and Web browsers, users often see mentions to topics (or concepts) that attract their attention. In a scenario of significant practical interest, topics are explored in situ, ...
|
||
| SESSION: Session 9A: Streams | ||
| Fernando Diaz | ||
| Islands in the Stream: A Study of Item Recommendation within an Enterprise Social Stream | ||
| Ido Guy, Roy Levin, Tal Daniel, Ella Bolshinsky | ||
| Pages: 665-674 | ||
| doi>10.1145/2766462.2767746 | ||
|
||
|
Social streams allow users to receive updates from their network by syndicating social media activity. These streams have become a popular way to share and consume information both on the web and in the enterprise. With so much activity going on, filtering ...
|
||
| Evaluating Streams of Evolving News Events | ||
| Gaurav Baruah, Mark D. Smucker, Charles L.A. Clarke | ||
| Pages: 675-684 | ||
| doi>10.1145/2766462.2767751 | ||
|
||
|
People track news events according to their interests and available time. For a major event of great personal interest, they might check for updates several times an hour, taking time to keep abreast of all aspects of the evolving event. For minor events ...
|
||
| SESSION: Session 9B: Cards | ||
| Eugene Agichtein | ||
| Information Retrieval as Card Playing: A Formal Model for Optimizing Interactive Retrieval Interface | ||
| Yinan Zhang, ChengXiang Zhai | ||
| Pages: 685-694 | ||
| doi>10.1145/2766462.2767761 | ||
|
||
|
We propose a novel formal model for optimizing interactive information retrieval interfaces. To model interactive retrieval in a general way, we frame the task of an interactive retrieval system as to choose a sequence of interface cards to present to ...
|
||
| From Queries to Cards: Re-ranking Proactive Card Recommendations Based on Reactive Search History | ||
| Milad Shokouhi, Qi Guo | ||
| Pages: 695-704 | ||
| doi>10.1145/2766462.2767705 | ||
|
||
|
The growing accessibility of mobile devices has substantially reformed the way users access information. While the reactive search by query remains as common as before, recent years have witnessed the emergence of various proactive systems such as Google ...
|
||
| SESSION: Short Papers | ||
| Using Sensor Metadata Streams to Identify Topics of Local Events in the City | ||
| M-Dyaa Albakour, Craig Macdonald, Iadh Ounis | ||
| Pages: 711-714 | ||
| doi>10.1145/2766462.2767837 | ||
|
||
|
In this paper, we study the emerging Information Retrieval (IR) task of local event retrieval using sensor metadata streams. Sensor metadata streams include information such as the crowd density from video processing, audio classifications, and social ...
|
||
| StarSum: A Simple Star Graph for Multi-document Summarization | ||
| Mohammed Al-Dhelaan | ||
| Pages: 715-718 | ||
| doi>10.1145/2766462.2767790 | ||
|
||
|
Graph-based approaches for multi-document summarization have been widely used to extract top sentences for a summary. Traditionally, the documents' cluster is modeled as a graph of the cluster's sentences only which might limit the ability of recognizing ...
|
||
| When Relevance Judgement is Happening?: An EEG-based Study | ||
| Marco Allegretti, Yashar Moshfeghi, Maria Hadjigeorgieva, Frank E. Pollick, Joemon M. Jose, Gabriella Pasi | ||
| Pages: 719-722 | ||
| doi>10.1145/2766462.2767811 | ||
|
||
|
Relevance is a central notion in Information Retrieval, but it is considered to be a difficult concept to define. We analyse brain signals for the first 800 milliseconds (ms) of a relevance assessment process to answer the question "when relevance is ...
|
||
| Search Engine Evaluation based on Search Engine Switching Prediction | ||
| Olga Arkhipova, Lidia Grauer, Igor Kuralenok, Pavel Serdyukov | ||
| Pages: 723-726 | ||
| doi>10.1145/2766462.2767786 | ||
|
||
|
In this paper we present a novel application of the search engine switching prediction model for online evaluation. We propose a new metric pSwitch for A/B-testing, which allows us to evaluate the quality of search engines in different aspects such as ...
|
||
| Time-Aware Authorship Attribution for Short Text Streams | ||
| Hosein Azarbonyad, Mostafa Dehghani, Maarten Marx, Jaap Kamps | ||
| Pages: 727-730 | ||
| doi>10.1145/2766462.2767799 | ||
|
||
|
Identifying authors of short texts on Internet or social media based communication systems is an important tool against fraud and cybercrimes. Besides the challenges raised by the limited length of these short messages, evolving language and writing ...
|
||
| A Priori Relevance Based On Quality and Diversity Of Social Signals | ||
| Ismail Badache, Mohand Boughanem | ||
| Pages: 731-734 | ||
| doi>10.1145/2766462.2767807 | ||
|
||
|
Social signals (users' actions) associated with web resources (documents) can be considered as an additional information that can play a role to estimate a priori importance of the resource. In this paper, we are particularly interested in: first, showing ...
|
||
| Document Comprehensiveness and User Preferences in Novelty Search Tasks | ||
| Ashraf Bah, Praveen Chandar, Ben Carterette | ||
| Pages: 735-738 | ||
| doi>10.1145/2766462.2767820 | ||
|
||
|
Different users may be attempting to satisfy different information needs while providing the same query to a search engine. Addressing that issue is addressing Novelty and Diversity in information retrieval. Novelty and Diversity search task models the ...
|
||
| Cost-Aware Result Caching for Meta-Search Engines | ||
| Emre Bakkal, Ismail Sengor Altingovde, Ismail Hakki Toroslu | ||
| Pages: 739-742 | ||
| doi>10.1145/2766462.2767813 | ||
|
||
|
Our goal in this paper is to design cost-aware result caching approaches for meta-search engines. We introduce different levels of eviction, namely, query-, resource- and entry-level, based on the granularity of the entries to be evicted from the cache ...
|
||
| From Unlabelled Tweets to Twitter-specific Opinion Words | ||
| Felipe Bravo-Marquez, Eibe Frank, Bernhard Pfahringer | ||
| Pages: 743-746 | ||
| doi>10.1145/2766462.2767770 | ||
|
||
|
In this article, we propose a word-level classification model for automatically generating a Twitter-specific opinion lexicon from a corpus of unlabelled tweets. The tweets from the corpus are represented by two vectors: a bag-of-words vector and a semantic ...
|
||
| The Best Published Result is Random: Sequential Testing and its Effect on Reported Effectiveness | ||
| Ben Carterette | ||
| Pages: 747-750 | ||
| doi>10.1145/2766462.2767812 | ||
|
||
|
Reusable test collections allow researchers to rapidly test different algorithms to find the one that works "best". But because of randomness in the topic sample, or in relevance judgments, or in interactions among system components, extreme results ...
|
||
| Load-sensitive CPU Power Management for Web Search Engines | ||
| Matteo Catena, Craig Macdonald, Nicola Tonellotto | ||
| Pages: 751-754 | ||
| doi>10.1145/2766462.2767809 | ||
|
||
|
Web search engine companies require power-hungry data centers with thousands of servers to efficiently perform searches on a large scale. This permits the search engines to serve high arrival rates of user queries with low latency, but poses economical ...
|
||
| Retrieval from Noisy E-Discovery Corpus in the Absence of Training Data | ||
| Anirban Chakraborty, Kripabandhu Ghosh, Swapan Kumar Parui | ||
| Pages: 755-758 | ||
| doi>10.1145/2766462.2767828 | ||
|
||
|
OCR errors hurt retrieval performance to a great extent. Research has been done on modelling and correction of OCR errors. However, most of the existing systems use language dependent resources or training texts for studying the nature of errors. Not ...
|
||
| Opinion Spammer Detection in Web Forum | ||
| Yu-Ren Chen, Hsin-Hsi Chen | ||
| Pages: 759-762 | ||
| doi>10.1145/2766462.2767766 | ||
|
||
|
In this paper, a real case study on opinion spammer detection in web forum is presented. We explore user profiles, maximum spamicity of first posts of users, burstiness of registration of user accounts, and frequent poster set to build a model with SVM ...
|
||
| Multi-Faceted Recall of Continuous Active Learning for Technology-Assisted Review | ||
| Gordon V. Cormack, Maura R. Grossman | ||
| Pages: 763-766 | ||
| doi>10.1145/2766462.2767771 | ||
|
Continuous active learning achieves high recall for technology-assisted review, not only for an overall information need, but also for various facets of that information need, whether explicit or implicit. Through simulations using Cormack and Grossman's ...
|
||
| Time Pressure and System Delays in Information Search | ||
| Anita Crescenzi, Diane Kelly, Leif Azzopardi | ||
| Pages: 767-770 | ||
| doi>10.1145/2766462.2767817 | ||
|
We report preliminary results of the impact of time pressure and system delays on search behavior from a laboratory study with forty-three participants. To induce time pressure, we randomly assigned half of our study participants to a treatment condition ...
|
||
| How Random Decisions Affect Selective Distributed Search | ||
| Zhuyun Dai, Yubin Kim, Jamie Callan | ||
| Pages: 771-774 | ||
| doi>10.1145/2766462.2767796 | ||
|
Selective distributed search is a retrieval architecture that reduces search costs by partitioning a corpus into topical shards such that only a few shards need to be searched for each query. Prior research created topical shards by using random seed ...
|
||
| Comparing Approaches for Query Autocompletion | ||
| Giovanni Di Santo, Richard McCreadie, Craig Macdonald, Iadh Ounis | ||
| Pages: 775-778 | ||
| doi>10.1145/2766462.2767829 | ||
|
Within a search engine, query auto-completion aims to predict the final query the user wants to enter as they type, with the aim of reducing query entry time and potentially preparing the search results in advance of query submission. There are a large ...
|
||
| Sign-Aware Periodicity Metrics of User Engagement for Online Search Quality Evaluation | ||
| Alexey Drutsa | ||
| Pages: 779-782 | ||
| doi>10.1145/2766462.2767814 | ||
|
Modern Internet companies improve evaluation criteria of their data-driven decision-making that is based on online controlled experiments (also known as A/B tests). The amplitude metrics of user engagement are known to be well sensitive to service changes, ...
|
||
| Modelling Term Dependence with Copulas | ||
| Carsten Eickhoff, Arjen P. de Vries, Thomas Hofmann | ||
| Pages: 783-786 | ||
| doi>10.1145/2766462.2767831 | ||
|
Many generative language and relevance models assume conditional independence between the likelihood of observing individual terms. This assumption is obviously naive, but also hard to replace or relax. There are only very few term pairs that actually ...
|
||
| Modeling Website Topic Cohesion at Scale to Improve Webpage Classification | ||
| Dhivya Eswaran, Paul N. Bennett, Joseph J. Pfeiffer, III | ||
| Pages: 787-790 | ||
| doi>10.1145/2766462.2767834 | ||
|
Considerable work in web page classification has focused on incorporating the topical structure of the web (e.g., the hyperlink graph) to improve prediction accuracy. However, the majority of work has primarily focused on relational or graph-based methods ...
|
||
| Topic-centric Classification of Twitter User's Political Orientation | ||
| Anjie Fang, Iadh Ounis, Philip Habel, Craig Macdonald, Nut Limsopatham | ||
| Pages: 791-794 | ||
| doi>10.1145/2766462.2767833 | ||
|
In the recent Scottish Independence Referendum (hereafter, IndyRef), Twitter offered a broad platform for people to express their opinions, with millions of IndyRef tweets posted over the campaign period. In this paper, we aim to classify people's voting ...
|
||
| Word Embedding based Generalized Language Model for Information Retrieval | ||
| Debasis Ganguly, Dwaipayan Roy, Mandar Mitra, Gareth J.F. Jones | ||
| Pages: 795-798 | ||
| doi>10.1145/2766462.2767780 | ||
|
Word2vec, a state-of-the-art word embedding technique has gained a lot of interest in the NLP community. The embedding of the word vectors helps to retrieve a list of words that are used in similar contexts with respect to a given word. In this paper, ...
|
||
| A Head-Weighted Gap-Sensitive Correlation Coefficient | ||
| Ning Gao, Douglas Oard | ||
| Pages: 799-802 | ||
| doi>10.1145/2766462.2767793 | ||
|
Information retrieval systems rank documents, and shared-task evaluations yield results that can be used to rank information retrieval systems. Comparing rankings in ways that can yield useful insights is thus an important capability. When making such ...
|
||
| On Term Selection Techniques for Patent Prior Art Search | ||
| Mona Golestan Far, Scott Sanner, Mohamed Reda Bouadjenek, Gabriela Ferraro, David Hawking | ||
| Pages: 803-806 | ||
| doi>10.1145/2766462.2767801 | ||
|
In this paper, we investigate the influence of term selection on retrieval performance on the CLEF-IP prior art test collection, using the Description section of the patent query with Language Model (LM) and BM25 scoring functions. We find that an oracular ...
|
||
| Automatic Feature Generation on Heterogeneous Graph for Music Recommendation | ||
| Chun Guo, Xiaozhong Liu | ||
| Pages: 807-810 | ||
| doi>10.1145/2766462.2767808 | ||
|
Online music streaming services (MSS) experienced exponential growth over the past decade. The giant MSS providers not only built massive music collection with metadata, they also accumulated large amount of heterogeneous data generated from users, e.g. ...
|
||
| Differences in Eye-Tracking Measures Between Visits and Revisits to Relevant and Irrelevant Web Pages | ||
| Jacek Gwizdka, Yinglong Zhang | ||
| Pages: 811-814 | ||
| doi>10.1145/2766462.2767795 | ||
|
This short paper presents initial results from a project, in which we investigated differences in how users view relevant and irrelevant Web pages on their visits and revisits. The users' viewing of Web pages was characterized by eye-tracking measures, ...
|
||
| Reducing Hubness: A Cause of Vulnerability in Recommender Systems | ||
| Kazuo Hara, Ikumi Suzuki, Kei Kobayashi, Kenji Fukumizu | ||
| Pages: 815-818 | ||
| doi>10.1145/2766462.2767823 | ||
|
It is known that memory-based collaborative filtering systems are vulnerable to shilling attacks. In this paper, we demonstrate that hubness, which occurs in high dimensional data, is exploited by the attacks. Hence we explore methods for reducing hubness ...
|
||
| Modularity-Based Query Clustering for Identifying Users Sharing a Common Condition | ||
| Maayan Gal-On Harel, Elad Yom-Tov | ||
| Pages: 819-822 | ||
| doi>10.1145/2766462.2767798 | ||
|
We present an algorithm for identifying users who share a common condition from anonymized search engine logs. Input to the algorithm is a set of seed phrases that identify users with the condition of interest with high precision albeit at a very low ...
|
||
| Understanding Temporal Query Intent | ||
| Mohammed Hasanuzzaman, Sriparna Saha, Gaël Dias, Stéphane Ferrari | ||
| Pages: 823-826 | ||
| doi>10.1145/2766462.2767792 | ||
|
Understanding the temporal orientation of web search queries is an important issue for the success of information access systems. In this paper, we propose a multi-objective ensemble learning solution that (1) allows to accurately classify queries along ...
|
||
| On the Reusability of Open Test Collections | ||
| Seyyed Hadi Hashemi, Charles L.A. Clarke, Adriel Dean-Hall, Jaap Kamps, Julia Kiseleva | ||
| Pages: 827-830 | ||
| doi>10.1145/2766462.2767788 | ||
|
Creating test collections for modern search tasks is increasingly more challenging due to the growing scale and dynamic nature of content, and need for richer contextualization of the statements of request. To address these issues, the TREC Contextual ...
|
||
| Towards Vandalism Detection in Knowledge Bases: Corpus Construction and Analysis | ||
| Stefan Heindorf, Martin Potthast, Benno Stein, Gregor Engels | ||
| Pages: 831-834 | ||
| doi>10.1145/2766462.2767804 | ||
|
We report on the construction of the Wikidata Vandalism Corpus WDVC-2015, the first corpus for vandalism in knowledge bases. Our corpus is based on the entire revision history of Wikidata, the knowledge base underlying Wikipedia. Among Wikidata's 24 ...
|
||
| About the 'Compromised Information Need' and Optimal Interaction as Quality Measure for Search Interfaces | ||
| Eduard C. Hoenkamp | ||
| Pages: 835-838 | ||
| doi>10.1145/2766462.2767800 | ||
|
Taylor's concept of levels of information need has been cited in over a hundred IR publications since his work was first published. It concerns the phases a searcher goes through, starting with the feeling that information seems missing, to expressing ...
|
||
| I See You: Person-of-Interest Search in Social Networks | ||
| Hsun-Ping Hsieh, Cheng-Te Li, Rui Yan | ||
| Pages: 839-842 | ||
| doi>10.1145/2766462.2767767 | ||
|
Searching for a particular person by specifying her name is one of the essential functions in online social networking services such as Facebook. So many times, however, one would like to find a person but what she knows is few social labels about the ...
|
||
| Towards Quantifying the Impact of Non-Uniform Information Access in Collaborative Information Retrieval | ||
| Nyi Nyi Htun, Martin Halvey, Lynne Baillie | ||
| Pages: 843-846 | ||
| doi>10.1145/2766462.2767779 | ||
|
The majority of research into Collaborative Information Retrieval (CIR) has assumed a uniformity of information access and visibility between collaborators. However in a number of real world scenarios, information access is not uniform between all collaborators ...
|
||
| Features of Disagreement Between Retrieval Effectiveness Measures | ||
| Timothy Jones, Paul Thomas, Falk Scholer, Mark Sanderson | ||
| Pages: 847-850 | ||
| doi>10.1145/2766462.2767824 | ||
|
Many IR effectiveness measures are motivated from intuition, theory, or user studies. In general, most effectiveness measures are well correlated with each other. But, what about where they don't correlate? Which rankings cause measures to disagree? ...
|
||
| Subsequence Search in Event-Interval Sequences | ||
| Orestis Kostakis, Aristides Gionis | ||
| Pages: 851-854 | ||
| doi>10.1145/2766462.2767778 | ||
|
We study the problem of subsequence search in databases of event-interval sequences, or e-sequences. In contrast to sequences of instantaneous events, e-sequences contain events that have a duration. In Information Retrieval applications, e-sequences ...
|
||
| Searcher in a Strange Land: Understanding Web Search from Familiar and Unfamiliar Locations | ||
| Elad Kravi, Eugene Agichtein, Ido Guy, Yaron Kanza, Avihai Mejer, Dan Pelleg | ||
| Pages: 855-858 | ||
| doi>10.1145/2766462.2767782 | ||
|
With mobile devices, web search is no longer limited to specific locations. People conduct search from practically anywhere, including at home, at work, when traveling and when on vacation. How should this influence search tools and web services? In ...
|
||
| Evaluating Retrieval Models through Histogram Analysis | ||
| Kriste Krstovski, David A. Smith, Michael J. Kurtz | ||
| Pages: 859-862 | ||
| doi>10.1145/2766462.2767821 | ||
|
We present a novel approach for efficiently evaluating the performance of retrieval models and introduce two evaluation metrics: Distributional Overlap (DO), which compares the clustering of scores of relevant and non-relevant documents, and Histogram ...
|
||
| Inter-Category Variation in Location Search | ||
| Chia-Jung Lee, Nick Craswell, Vanessa Murdock | ||
| Pages: 863-866 | ||
| doi>10.1145/2766462.2767797 | ||
|
When searching for place entities such as businesses or points of interest, the desired place may be close (finding the nearest ATM) or far away (finding a hotel in another city). Understanding the role of distance in predicting user interests can guide ...
|
||
| Reachability based Ranking in Interactive Image Retrieval | ||
| Jiyi Li | ||
| Pages: 867-870 | ||
| doi>10.1145/2766462.2767777 | ||
|
In some interactive image retrieval systems, users can select images from image search results and click to view their similar or related images until they reach the targets. Existing image ranking options are based on relevance, update time, interestingness ...
|
||
| Modeling Multi-query Retrieval Tasks Using Density Matrix Transformation | ||
| Qiuchi Li, Jingfei Li, Peng Zhang, Dawei Song | ||
| Pages: 871-874 | ||
| doi>10.1145/2766462.2767819 | ||
|
The quantum probabilistic framework has recently been applied to Information Retrieval (IR). A representative is the Quantum Language Model (QLM), which is developed for the ad-hoc retrieval with single queries and has achieved significant improvements ...
|
||
| Predicting User Behavior in Display Advertising via Dynamic Collective Matrix Factorization | ||
| Sheng Li, Jaya Kawale, Yun Fu | ||
| Pages: 875-878 | ||
| doi>10.1145/2766462.2767781 | ||
|
Conversion prediction and click prediction are two important and intertwined problems in display advertising, but existing approaches usually look at them in isolation. In this paper, we aim to predict the conversion response of users by jointly examining ...
|
||
| Zero-shot Image Tagging by Hierarchical Semantic Embedding | ||
| Xirong Li, Shuai Liao, Weiyu Lan, Xiaoyong Du, Gang Yang | ||
| Pages: 879-882 | ||
| doi>10.1145/2766462.2767773 | ||
|
Given the difficulty of acquiring labeled examples for many fine-grained visual classes, there is an increasing interest in zero-shot image tagging, aiming to tag images with novel labels that have no training examples present. Using a semantic space ...
|
||
| Using Term Location Information to Enhance Probabilistic Information Retrieval | ||
| Baiyan Liu, Xiangdong An, Jimmy Xiangji Huang | ||
| Pages: 883-886 | ||
| doi>10.1145/2766462.2767827 | ||
|
Nouns are more important than other parts of speech in information retrieval and are more often found near the beginning or the end of sentences. In this paper, we investigate the effects of rewarding terms based on their location in sentences on information ...
|
||
| Learning Context-aware Latent Representations for Context-aware Collaborative Filtering | ||
| Xin Liu, Wei Wu | ||
| Pages: 887-890 | ||
| doi>10.1145/2766462.2767775 | ||
|
In this paper, we propose a generic framework to learn context-aware latent representations for context-aware collaborative filtering. Contextual contents are combined via a function to produce the context influence factor, which is then combined with ...
|
||
| Exploiting User and Business Attributes for Personalized Business Recommendation | ||
| Kai Lu, Yi Zhang, Lanbo Zhang, Shuxin Wang | ||
| Pages: 891-894 | ||
| doi>10.1145/2766462.2767806 | ||
|
Data sparsity and cold-start are two major problems in personalized recommendation. They are especially severe in business recommendation, because business transactions are usually completed offline and customers generally do not provide ratings after ...
|
||
| Speeding up Document Ranking with Rank-based Features | ||
| Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto | ||
| Pages: 895-898 | ||
| doi>10.1145/2766462.2767776 | ||
|
Learning to Rank (LtR) is an effective machine learning methodology for inducing high-quality document ranking functions. Given a query and a candidate set of documents, where query-document pairs are represented by feature vectors, a machine-learned ...
|
||
| Mining Measured Information from Text | ||
| Arun S. Maiya, Dale Visser, Andrew Wan | ||
| Pages: 899-902 | ||
| doi>10.1145/2766462.2767789 | ||
|
We present an approach to extract measured information from text (e.g., a $1370~^{\circ}C$ melting point, a BMI greater than 29.9 kg/m$^2$). Such extractions are critically important across a wide range of domains --- especially those involving search ...
|
||
| An Initial Investigation into Fixed and Adaptive Stopping Strategies | ||
| David Maxwell, Leif Azzopardi, Kalervo Järvelin, Heikki Keskustalo | ||
| Pages: 903-906 | ||
| doi>10.1145/2766462.2767802 | ||
|
Most models, measures and simulations often assume that a searcher will stop at a predetermined place in a ranked list of results. However, during the course of a search session, real-world searchers will vary and adapt their interactions with a ranked ...
|
||
| Regularised Cross-Modal Hashing | ||
| Sean Moran, Victor Lavrenko | ||
| Pages: 907-910 | ||
| doi>10.1145/2766462.2767816 | ||
|
In this paper we propose Regularised Cross-Modal Hashing (RCMH) a new cross-modal hashing model that projects annotation and visual feature descriptors into a common Hamming space. RCMH optimises the hashcode similarity of related data-points in the ...
|
||
| Adapted B-CUBED Metrics to Unbalanced Datasets | ||
| Jose G. Moreno, Gaël Dias | ||
| Pages: 911-914 | ||
| doi>10.1145/2766462.2767836 | ||
|
B-CUBED metrics have recently been adopted in the evaluation of clustering results as well as in many other related tasks. However, this family of metrics is not well adapted when datasets are unbalanced. This issue is extremely frequent in Web results, ...
|
||
| A Time-aware Random Walk Model for Finding Important Documents in Web Archives | ||
| Tu Ngoc Nguyen, Nattiya Kanhabua, Claudia Niederée, Xiaofei Zhu | ||
| Pages: 915-918 | ||
| doi>10.1145/2766462.2767832 | ||
|
Due to their first-hand, diverse and evolution-aware reflection of nearly all areas of life, web archives are emerging as gold-mines for content analytics of many sorts. However, supporting search, which goes beyond navigational search via URLs, is a ...
|
||
| A Test Collection for Spoken Gujarati Queries | ||
| Douglas W. Oard, Rashmi Sankepally, Jerome White, Aren Jansen, Craig Harman | ||
| Pages: 919-922 | ||
| doi>10.1145/2766462.2767791 | ||
|
The development of a new test collection is described in which the task is to search naturally occurring spoken content using naturally occurring spoken queries. To support research on speech retrieval for low-resource settings, the collection includes ...
|
||
| Discovering Experts across Multiple Domains | ||
| Aditya Pal | ||
| Pages: 923-926 | ||
| doi>10.1145/2766462.2767774 | ||
|
Researchers have focused on finding experts in individual domains, such as emails, forums, question answering, blogs, and microblogs. In this paper, we propose an algorithm for finding experts across these different domains. To do this, we propose an ...
|
||
| Using Key Concepts in a Translation Model for Retrieval | ||
| Jae Hyun Park, W. Bruce Croft | ||
| Pages: 927-930 | ||
| doi>10.1145/2766462.2767768 | ||
|
Many queries, especially those in the form of longer questions, contain a subset of terms representing key concepts that describe the most important part of the user's information need. Detecting the key concepts in a query can be used as the basis for ...
|
||
| On the Cost of Phrase-Based Ranking | ||
| Matthias Petri, Alistair Moffat | ||
| Pages: 931-934 | ||
| doi>10.1145/2766462.2767769 | ||
|
Effective postings list compression techniques, and the efficiency of postings list processing schemes such as WAND, have significantly improved the practical performance of ranked document retrieval using inverted indexes. Recently, suffix array-based ...
|
||
| Location-Aware Model for News Events in Social Media | ||
| Mauricio Quezada, Vanessa Peña-Araya, Barbara Poblete | ||
| Pages: 935-938 | ||
| doi>10.1145/2766462.2767815 | ||
|
Nowadays, social media services are being used extensively as news sources and for spreading information on real-world events. Several studies have focused on detecting those events and locating them geographically. However, in order to study real-world ...
|
||
| Exploring Opportunities to Facilitate Serendipity in Search | ||
| Ataur Rahman, Max L. Wilson | ||
| Pages: 939-942 | ||
| doi>10.1145/2766462.2767783 | ||
|
Serendipitously discovering new information can bring many benefits. Although we can design systems to highlight serendipitous information, serendipity cannot be easily orchestrated and is thus hard to study. In this paper, we deployed a working search ...
|
||
| Combining Orthogonal Information in Large-Scale Cross-Language Information Retrieval | ||
| Shigehiko Schamoni, Stefan Riezler | ||
| Pages: 943-946 | ||
| doi>10.1145/2766462.2767805 | ||
|
System combination is an effective strategy to boost retrieval performance, especially in complex applications such as cross-language information retrieval (CLIR) where the aspects of translation and retrieval have to be optimized jointly. We focus on ...
|
||
| Tailoring Music Recommendations to Users by Considering Diversity, Mainstreaminess, and Novelty | ||
| Markus Schedl, David Hauger | ||
| Pages: 947-950 | ||
| doi>10.1145/2766462.2767763 | ||
|
A shortcoming of current approaches for music recommendation is that they consider user-specific characteristics only on a very simple level, typically as some kind of interaction between users and items when employing collaborative filtering. To alleviate ...
|
||
| Challenges of Mathematical Information Retrieval in the NTCIR-11 Math Wikipedia Task | ||
| Moritz Schubotz, Abdou Youssef, Volker Markl, Howard S. Cohl | ||
| Pages: 951-954 | ||
| doi>10.1145/2766462.2767787 | ||
|
Mathematical Information Retrieval concerns retrieving information related to a particular mathematical concept. The NTCIR-11 Math Task develops an evaluation test collection for document sections retrieval of scientific articles based on human generated ...
|
||
| Probabilistic Multileave for Online Retrieval Evaluation | ||
| Anne Schuth, Robert-Jan Bruintjes, Fritjof Büttner, Joost van Doorn, Carla Groenland, Harrie Oosterhuis, Cong-Nguyen Tran, Bas Veeling, Jos van der Velde, Roger Wechsler, David Woudenberg, Maarten de Rijke | ||
| Pages: 955-958 | ||
| doi>10.1145/2766462.2767838 | ||
|
Online evaluation methods for information retrieval use implicit signals such as clicks from users to infer preferences between rankers. A highly sensitive way of inferring these preferences is through interleaved comparisons. Recently, interleaved comparisons ...
|
||
| Twitter Sentiment Analysis with Deep Convolutional Neural Networks | ||
| Aliaksei Severyn, Alessandro Moschitti | ||
| Pages: 959-962 | ||
| doi>10.1145/2766462.2767830 | ||
|
This paper describes our deep learning system for sentiment analysis of tweets. The main contribution of this work is a new model for initializing the parameter weights of the convolutional neural network, which is crucial to train an accurate model ...
|
||
| Anchoring and Adjustment in Relevance Estimation | ||
| Milad Shokouhi, Ryen White, Emine Yilmaz | ||
| Pages: 963-966 | ||
| doi>10.1145/2766462.2767841 | ||
|
People's tendency to overly rely on prior information has been well studied in psychology in the context of anchoring and adjustment. Anchoring biases pervade many aspects of human behavior. In this paper, we present a study of anchoring bias in information ...
|
||
| Cognitive Activity during Web Search | ||
| Md. Hedayetul Islam Shovon, D (Nanda) Nandagopal, Jia Tina Du, Ramasamy Vijayalakshmi, Bernadine Cocks | ||
| Pages: 967-970 | ||
| doi>10.1145/2766462.2767784 | ||
|
Searching on the Web or Net-surfing is a part of everyday life for many people, but little is known about the brain activity during Web searching. Such knowledge is essential for better understanding of the cognitive demands imposed by the search system ...
|
||
| Personalized Semantic Ranking for Collaborative Recommendation | ||
| Song Xu, Shu Wu, Liang Wang | ||
| Pages: 971-974 | ||
| doi>10.1145/2766462.2767772 | ||
|
Recently a ranking view of collaborative recommendation has received much attention in recommendation systems. Most of existing ranking approaches are based on pairwise assumption, i.e., everything that has not been selected is of less interest for a ...
|
||
| Active Learning for Entity Filtering in Microblog Streams | ||
| Damiano Spina, Maria-Hendrike Peetz, Maarten de Rijke | ||
| Pages: 975-978 | ||
| doi>10.1145/2766462.2767839 | ||
|
Monitoring the reputation of entities such as companies or brands in microblog streams (e.g., Twitter) starts by selecting mentions that are related to the entity of interest. Entities are often ambiguous (e.g., "Jaguar" or "Ford") and effective methods ...
|
||
| Relevance-aware Filtering of Tuples Sorted by an Attribute Value via Direct Optimization of Search Quality Metrics | ||
| Nikita V. Spirin, Mikhail Kuznetsov, Julia Kiseleva, Yaroslav V. Spirin, Pavel A. Izhutov | ||
| Pages: 979-982 | ||
| doi>10.1145/2766462.2767822 | ||
|
Sorting tuples by an attribute value is a common search scenario and many search engines support such capabilities, e.g. price-based sorting in e-commerce, time-based sorting on a job or social media website. However, sorting purely by the attribute ...
|
||
| Multi-source Information Fusion for Personalized Restaurant Recommendation | ||
| Jing Sun, Yun Xiong, Yangyong Zhu, Junming Liu, Chu Guan, Hui Xiong | ||
| Pages: 983-986 | ||
| doi>10.1145/2766462.2767818 | ||
|
In this paper, we study the problem of personalized restaurant recommendations. Specifically, we develop a probabilistic factor analysis framework, named RMSQ-MF, which has the ability in exploiting multi-source information, such as the users' task, ...
|
||
| Joint Matrix Factorization and Manifold-Ranking for Topic-Focused Multi-Document Summarization | ||
| Jiwei Tan, Xiaojun Wan, Jianguo Xiao | ||
| Pages: 987-990 | ||
| doi>10.1145/2766462.2767765 | ||
|
Manifold-ranking has proved to be an effective method for topic-focused multi-document summarization. As basic manifold-ranking based summarization method constructs the relationships between sentences simply by the bag-of-words cosine similarity, we ...
|
||
| Towards Understanding the Impact of Length in Web Search Result Summaries over a Speech-only Communication Channel | ||
| Johanne R. Trippas, Damiano Spina, Mark Sanderson, Lawrence Cavedon | ||
| Pages: 991-994 | ||
| doi>10.1145/2766462.2767826 | ||
|
Presenting search results over a speech-only communication channel involves a number of challenges for users due to cognitive limitations and the serial nature of speech. We investigated the impact of search result summary length in speech-based web ...
|
||
| Early Detection of Topical Expertise in Community Question Answering | ||
| David van Dijk, Manos Tsagkias, Maarten de Rijke | ||
| Pages: 995-998 | ||
| doi>10.1145/2766462.2767840 | ||
|
We focus on detecting potential topical experts in community question answering platforms early on in their lifecycle. We use a semi-supervised machine learning approach. We extract three types of feature: (i) textual, (ii) behavioral, and (iii) time-aware, ...
|
||
| LBMCH: Learning Bridging Mapping for Cross-modal Hashing | ||
| Yang Wang, Xuemin Lin, Lin Wu, Wenjie Zhang, Qing Zhang | ||
| Pages: 999-1002 | ||
| doi>10.1145/2766462.2767825 | ||
|
Hashing has gained considerable attention on large-scale similarity search, due to its enjoyable efficiency and low storage cost. In this paper, we study the problem of learning hash functions in the context of multi-modal data for cross-modal similarity ...
|
||
| Gibberish, Assistant, or Master?: Using Tweets Linking to News for Extractive Single-Document Summarization | ||
| Zhongyu Wei, Wei Gao | ||
| Pages: 1003-1006 | ||
| doi>10.1145/2766462.2767835 | ||
|
Single-document summarization is a challenging task. In this paper, we explore effective ways using the tweets linking to news for generating extractive summary of each document. We reveal the very basic value of tweets that can be utilized by regarding ...
|
||
| Context-aware Point-of-Interest Recommendation Using Tensor Factorization with Social Regularization | ||
| Lina Yao, Quan Z. Sheng, Yongrui Qin, Xianzhi Wang, Ali Shemshadi, Qi He | ||
| Pages: 1007-1010 | ||
| doi>10.1145/2766462.2767794 | ||
|
Point-of-Interest (POI) recommendation is a new type of recommendation task that comes along with the prevalence of location-based social networks in recent years. Compared with traditional tasks, it focuses more on personalized, context-aware recommendation ...
|
||
| Adaptive User Engagement Evaluation via Multi-task Learning | ||
| Hamed Zamani, Pooya Moradi, Azadeh Shakery | ||
| Pages: 1011-1014 | ||
| doi>10.1145/2766462.2767785 | ||
|
User engagement evaluation task in social networks has recently attracted considerable attention due to its applications in recommender systems. In this task, the posts containing users' opinions about items, e.g., the tweets containing the users' ratings ...
|
||
| Compact Snippet Caching for Flash-based Search Engines | ||
| Rui Zhang, Pengyu Sun, Jiancong Tong, Rebecca Jane Stones, Gang Wang, Xiaoguang Liu | ||
| Pages: 1015-1018 | ||
| doi>10.1145/2766462.2767764 | ||
|
In response to a user query, search engines return the top-k relevant results, each of which contains a small piece of text, called a snippet, extracted from the corresponding document. Obtaining a snippet is time consuming as it requires both document ...
|
||
| When Personalization Meets Conformity: Collective Similarity based Multi-Domain Recommendation | ||
| Xi Zhang, Jian Cheng, Shuang Qiu, Zhenfeng Zhu, Hanqing Lu | ||
| Pages: 1019-1022 | ||
| doi>10.1145/2766462.2767810 | ||
|
Existing recommender systems place emphasis on personalization to achieve promising accuracy. However, in the context of multiple domain, users are likely to seek the same behaviors as domain authorities. This conformity effect provides a wealth of prior ...
|
||
| Sub-document Timestamping of Web Documents | ||
| Yue Zhao, Claudia Hauff | ||
| Pages: 1023-1026 | ||
| doi>10.1145/2766462.2767803 | ||
|
Knowledge about a (Web) document's creation time has been shown to be an important factor in various temporal information retrieval settings. Commonly, it is assumed that such documents were created at a single point in time. While this assumption may ...
|
||
| DEMONSTRATION SESSION: Demonstrations | ||
| DINFRA: A One Stop Shop for Computing Multilingual Semantic Relatedness | ||
| Siamak Barzegar, Juliano Efson Sales, Andre Freitas, Siegfried Handschuh, Brian Davis | ||
| Pages: 1027-1028 | ||
| doi>10.1145/2766462.2767870 | ||
|
This demonstration presents an infrastructure for computing multilingual semantic relatedness and correlation for twelve natural languages by using three distributional semantic models (DSMs). Our demonstrator - DInfra (Distributional Infrastructure) ...
expand
|
||
| VenueMusic: A Venue-Aware Music Recommender System | ||
| Zhiyong Cheng, Jialie Shen | ||
| Pages: 1029-1030 | ||
| doi>10.1145/2766462.2767869 | ||
|
||
|
Users' music preferences can be greatly influenced by their location and environment nearby. In this demonstration, we present an intelligent music recommender system, called VenueMusic, to automatically identify suitable music for various popular venues ...
|
||
| Shiny on Your Crazy Diagonal | ||
| Giorgio Maria Di Nunzio | ||
| Pages: 1031-1032 | ||
| doi>10.1145/2766462.2767867 | ||
|
||
|
In this demo, we present a web application which allows users to interact with two retrieval models, namely the Binary Independence Model (BIM) and the BM25 model, on a standard TREC collection. The goal of this demo is to give students deeper insight ...
|
||
| CricketLinking: Linking Event Mentions from Cricket Match Reports to Ball Entities in Commentaries | ||
| Manish Gupta | ||
| Pages: 1033-1034 | ||
| doi>10.1145/2766462.2767865 | ||
|
||
|
The 2011 Cricket World Cup final match was watched by around 135 million people. Such a huge viewership demands a great experience for users of online cricket portals. Many portals like espncricinfo.com host a variety of content related to recent matches ...
|
||
| An Aspect-driven Social Media Explorer | ||
| Nedim Lipka, W. Bruce Croft | ||
| Pages: 1035-1036 | ||
| doi>10.1145/2766462.2767864 | ||
|
||
|
We demonstrate an exploration tool that organizes social media content under diverse aspects enabling comprehensive explorations. Unlike existing approaches that group content by trending topics, we present a holistic view of diverse and relevant content ...
|
||
| ERICA: Expert Guidance in Validating Crowd Answers | ||
| Nguyen Quoc Viet Hung, Duong Chi Thang, Matthias Weidlich, Karl Aberer | ||
| Pages: 1037-1038 | ||
| doi>10.1145/2766462.2767866 | ||
|
||
|
Crowdsourcing became an essential tool for a broad range of Web applications. Yet, the wide-ranging levels of expertise of crowd workers as well as the presence of faulty workers call for quality control of the crowdsourcing result. To this end, many ...
|
||
| Large-scale Image Retrieval using Neural Net Descriptors | ||
| David Novak, Michal Batko, Pavel Zezula | ||
| Pages: 1039-1040 | ||
| doi>10.1145/2766462.2767868 | ||
|
||
| Galean: Visualization of Geolocated News Events from Social Media | ||
| Vanessa Peña-Araya, Mauricio Quezada, Barbara Poblete | ||
| Pages: 1041-1042 | ||
| doi>10.1145/2766462.2767862 | ||
|
||
|
Online Social Networks (OSN) have changed the way information is produced and consumed. Organizing and retrieving unstructured data extracted from these platforms is not an easy task. Galean is a visual and interactive tool that aims to help journalists ...
|
||
| SciNet: Interactive Intent Modeling for Information Discovery | ||
| Tuukka Ruotsalo, Jaakko Peltonen, Manuel J.A. Eugster, Dorota Głowacka, Aki Reijonen, Giulio Jacucci, Petri Myllymäki, Samuel Kaski | ||
| Pages: 1043-1044 | ||
| doi>10.1145/2766462.2767863 | ||
|
||
|
Current search engines offer limited assistance for exploration and information discovery in complex search tasks. Instead, users are distracted by the need to focus their cognitive efforts on finding navigation cues, rather than selecting relevant information. ...
|
||
| Linse: A Distributional Semantics Entity Search Engine | ||
| Juliano Efson Sales, André Freitas, Siegfried Handschuh, Brian Davis | ||
| Pages: 1045-1046 | ||
| doi>10.1145/2766462.2767871 | ||
|
||
|
Entering 'Football Players from United States' when searching for 'American Footballers' is an example of vocabulary mismatch, which occurs when different words are used to express the same concepts. In order to address this phenomenon for entity search ...
|
||
| Online News Tracking for Ad-Hoc Queries | ||
| Jeroen B.P. Vuurens, Arjen P. de Vries, Roi Blanco, Peter Mika | ||
| Pages: 1047-1048 | ||
| doi>10.1145/2766462.2767872 | ||
|
||
|
Following news about a specific event can be a difficult task as new information is often scattered across web pages. An up-to-date summary of the event would help to inform users and allow them to navigate to articles that are likely to contain relevant ...
|
||
| DUMPLING: A Novel Dynamic Search Engine | ||
| Andrew Jie Zhou, Jiyun Luo, Hui Yang | ||
| Pages: 1049-1050 | ||
| doi>10.1145/2766462.2767873 | ||
|
||
|
In this demo paper, we introduce a new search engine that supports Information Retrieval (IR) in a dynamic setting. A dynamic search engine distinguishes itself by handling rich interactions and temporal dependency among the queries in a session or for ...
|
||
| SESSION: Doctoral Consortium | ||
| Promoting User Engagement and Learning in Amorphous Search Tasks | ||
| Piyush Arora | ||
| Pages: 1051-1051 | ||
| doi>10.1145/2766462.2767848 | ||
|
||
|
Much research in information retrieval (IR) focuses on optimization of the rank of relevant retrieval results for single shot ad hoc IR tasks. Relatively little research has been carried out on user engagement to support more complex search tasks. We ...
|
||
| Cross-Platform Question Routing for Better Question Answering | ||
| Mossaab Bagdouri | ||
| Pages: 1053-1053 | ||
| doi>10.1145/2766462.2767849 | ||
|
||
|
The last two decades have seen an increasing interest in the task of question answering (QA). Earlier approaches focused on automated retrieval and extraction models. Recent developments have more focus on community driven QA. This work addresses this ...
|
||
| Time Pressure in Information Search | ||
| Anita Crescenzi | ||
| Pages: 1055-1055 | ||
| doi>10.1145/2766462.2767851 | ||
|
||
|
The primary purpose of this research is to explore the impact of perceived time pressure on search behaviors, searcher perceptions of the search system and the search experience. Are there observable behavioral changes when a searcher is time-pressured? ...
|
||
| Controversy Detection and Stance Analysis | ||
| Shiri Dori-Hacohen | ||
| Pages: 1057-1057 | ||
| doi>10.1145/2766462.2767844 | ||
|
||
|
Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the "filter bubble" effect. Additionally, presenting information to the user about the different stances or sides of the ...
|
||
| Using Contextual Information to Understand Searching and Browsing Behavior | ||
| Julia Kiseleva | ||
| Pages: 1059-1059 | ||
| doi>10.1145/2766462.2767852 | ||
|
||
|
There is great imbalance in the richness of information on the web and the succinctness and poverty of search requests of web users, making their queries only a partial description of the underlying complex information needs. Finding ways to better leverage ...
|
||
| Transfer Learning for Information Retrieval | ||
| Pengfei Li | ||
| Pages: 1061-1061 | ||
| doi>10.1145/2766462.2767845 | ||
|
||
| Enhancing Mathematics Information Retrieval | ||
| Martin Líška | ||
| Pages: 1063-1063 | ||
| doi>10.1145/2766462.2767843 | ||
|
||
| Improving Search using Proximity-Based Statistics | ||
| Xiaolu Lu | ||
| Pages: 1065-1065 | ||
| doi>10.1145/2766462.2767847 | ||
|
||
| Spoken Conversational Search: Information Retrieval over a Speech-only Communication Channel | ||
| Johanne R. Trippas | ||
| Pages: 1067-1067 | ||
| doi>10.1145/2766462.2767850 | ||
|
||
| Finding Answers in Web Search | ||
| Evi Yulianti | ||
| Pages: 1069-1069 | ||
| doi>10.1145/2766462.2767846 | ||
|
||
|
There are many informational queries that could be answered with a text passage, thereby not requiring the searcher to access the full web document. When building manual annotations of answer passages for TREC queries, Keikha et al. [6] confirmed that ...
|
||
| SESSION: Industry Track Preface | ||
| Hang Li, Jaime Teevan | ||
|
||
|
It is our great pleasure to welcome you to the SIGIR Symposium on Information Retrieval in Practice (SIRIP 2015). The goal of SIRIP is to bring together information retrieval researchers, practitioners, analysts, and consumers, and to achieve ...
|
||
| SESSION: Industry Track Invited Talks | ||
| From Web Search Relevance to Vertical Search Relevance | ||
| Yi Chang | ||
| Pages: 1073-1073 | ||
| doi>10.1145/2766462.2776787 | ||
|
||
|
Web search relevance is a billion dollar challenge, while there is a disadvantage of backwardness in web search competition. Vertical search result can be incorporated to enrich web search content, therefore vertical search relevance is critical to provide ...
|
||
| Finding Money in the Haystack: Information Retrieval at Bloomberg | ||
| Jonathan J. Dorando, Konstantine Arkoudas, Parth Vasa, Gary Kazantsev, Gideon Mann | ||
| Pages: 1075-1075 | ||
| doi>10.1145/2766462.2776782 | ||
|
||
|
The financial markets are a rich domain for search, and it is not simple to serve the entire scope of financial professionals, who make their living on accurate, timely, and deep information. The data sources are many and disparate. This includes domains ...
|
||
| If SIGIR had an Academic Track, What Would Be In It? | ||
| David Hawking | ||
| Pages: 1077-1077 | ||
| doi>10.1145/2766462.2776784 | ||
|
||
|
It used to be the case that very little industry research was presented at SIGIR. Now the balance has radically changed -- many accepted papers have industry authors and many rely on industry data sets -- to the extent that a leading academic member ...
|
||
| WeChat Search & Headline: Sogou Joins Force with Tencent on Mobile Search | ||
| Chao Liu | ||
| Pages: 1079-1079 | ||
| doi>10.1145/2766462.2776781 | ||
|
||
|
Tencent Inc. is the biggest social network company in China. Its WeChat and QQ boast of 700 million and 800 million monthly active users (MAU), respectively. Sogou Inc., on the other hand, is a search leader in China, being the No. 2 and No. 3 on mobile/PC ...
|
||
| Structure, Personalization, Scale: A Deep Dive into LinkedIn Search | ||
| Asif Makhani | ||
| Pages: 1081-1081 | ||
| doi>10.1145/2766462.2776785 | ||
|
||
|
All of us are familiar with search as users. And as software engineers, many of us have worked on search problems in the context of web search, site search, or enterprise search. But search at LinkedIn is different. Our corpus is a richly structured ...
|
||
| Location in Search | ||
| Vanessa Murdock | ||
| Pages: 1083-1083 | ||
| doi>10.1145/2766462.2776783 | ||
|
||
|
As users turn increasingly to handheld devices to find information, the research community has focused on real-time location signals (GPS signals) to improve search engine effectiveness. Location signals have been investigated for predicting businesses ...
|
||
| Challenges and Opportunities in Online Evaluation of Search Engines | ||
| Pavel Serdyukov | ||
| Pages: 1085-1085 | ||
| doi>10.1145/2766462.2776786 | ||
|
||
|
Yandex is one of the largest Internet companies in Europe, operating Russia's most popular search engine, generating 58.6% of all search traffic in Russia (as of April 2015). Like all modern search engines, Yandex increasingly relies on online evaluation ...
|
||
| Lower Search Cost | ||
| Dou Shen | ||
| Pages: 1087-1087 | ||
| doi>10.1145/2766462.2776788 | ||
|
||
|
Web search is actually a pretty heavy task for most users since people need to launch a search engine's portal, phrase the right query and then go through search results to find the right information or service. To lower the search cost, commercial search ...
|
||
| SESSION: Industry Track Refereed Papers | ||
| Practical Lessons for Gathering Quality Labels at Scale | ||
| Omar Alonso | ||
| Pages: 1089-1092 | ||
| doi>10.1145/2766462.2776778 | ||
|
||
|
Information retrieval researchers and engineers use human computation as a mechanism to produce labeled data sets for product development, research and experimentation. To gather useful results, a successful labeling task relies on many different elements: ...
|
||
| Incremental Sampling of Query Logs | ||
| Ricardo Baeza-Yates | ||
| Pages: 1093-1096 | ||
| doi>10.1145/2766462.2776780 | ||
|
||
|
We introduce a simple technique to generate incremental query log samples that mimic the original query distribution well. In this way, editorial judgments for new queries can be consistently added to previous judgments. We also review the problem of ...
|
||
| Where to Go on Your Next Trip?: Optimizing Travel Destinations Based on User Preferences | ||
| Julia Kiseleva, Melanie J.I. Mueller, Lucas Bernardi, Chad Davis, Ivan Kovacek, Mats Stafseng Einarsen, Jaap Kamps, Alexander Tuzhilin, Djoerd Hiemstra | ||
| Pages: 1097-1100 | ||
| doi>10.1145/2766462.2776777 | ||
|
||
|
Recommendation based on user preferences is a common task for e-commerce websites. New recommendation algorithms are often evaluated by offline comparison to baseline algorithms such as recommending random or the most popular items. Here, we investigate ...
|
||
| Bringing Order to the Job Market: Efficient Job Offer Categorization in E-Recruitment | ||
| Emmanuel Malherbe, Mario Cataldi, Andrea Ballatore | ||
| Pages: 1101-1104 | ||
| doi>10.1145/2766462.2776779 | ||
|
||
|
E-recruitment uses a range of web-based technologies to find, evaluate, and hire new personnel for organizations. A crucial challenge in this arena lies in the categorization of job offers: candidates and operators often explore and analyze large numbers ...
|
||
| TUTORIAL SESSION: Tutorials | ||
| Yoelle Maarek | ||
|
||
|
This year's conference received twelve submissions, of which eight were accepted, and one was extended to an additional half-day. The decision was based on criteria of relevance to the SIGIR community, core quality and experience of presenters. The accepted ...
|
||
| Building and Using Models of Information Seeking, Search and Retrieval: Full Day Tutorial | ||
| Leif Azzopardi, Guido Zuccon | ||
| Pages: 1107-1110 | ||
| doi>10.1145/2766462.2767874 | ||
|
||
|
Understanding how people interact with information systems when searching is central to the study of Interactive Information Retrieval (IIR). While much of the prior work in this area has either been conceptual, observational or empirical, recently there ...
|
||
| Advanced Click Models and their Applications to IR: SIGIR 2015 Tutorial | ||
| Aleksandr Chuklin, Ilya Markov, Maarten de Rijke | ||
| Pages: 1111-1112 | ||
| doi>10.1145/2766462.2767882 | ||
|
||
|
This tutorial is concerned with more advanced and more recent topics in the area of click models. Here, we discuss recent developments in the area with a particular focus on applications of click models. The tutorial features a guest talk and a live demo ...
|
||
| An Introduction to Click Models for Web Search: SIGIR 2015 Tutorial | ||
| Aleksandr Chuklin, Ilya Markov, Maarten de Rijke | ||
| Pages: 1113-1115 | ||
| doi>10.1145/2766462.2767881 | ||
|
||
|
In this introductory tutorial we give an overview of click models for web search. We show how the framework of probabilistic graphical models helps to explain user behavior, build new evaluation metrics and perform simulations. The tutorial is augmented ...
|
||
| IR Evaluation: Modeling User Behavior for Measuring Effectiveness | ||
| Charles L.A. Clarke, Mark D. Smucker, Emine Yilmaz | ||
| Pages: 1117-1120 | ||
| doi>10.1145/2766462.2767876 | ||
|
||
|
This half-day tutorial on IR evaluation combines an introduction to classical IR evaluation methods with material on more recent user-oriented approaches. We primarily focus on off-line evaluation, but some material on on-line evaluation is also covered. ...
|
||
| Information Retrieval with Verbose Queries | ||
| Manish Gupta, Michael Bendersky | ||
| Pages: 1121-1124 | ||
| doi>10.1145/2766462.2767877 | ||
|
||
|
Recently, the focus of many novel search applications shifted from short keyword queries to verbose natural language queries. Examples include question answering systems and dialogue systems, voice search on mobile devices and entity search engines like ...
|
||
| Revisiting the Foundations of IR: Timeless, Yet Timely | ||
| Paul B. Kantor | ||
| Pages: 1125-1127 | ||
| doi>10.1145/2766462.2767878 | ||
|
||
|
As we face an explosion of potential new applications for the fundamental concepts and technologies of information retrieval, ranging from ad ranking to social media, from collaborative recommending to question answering systems, many researchers are ...
|
||
| IR Evaluation: Designing an End-to-End Offline Evaluation Pipeline | ||
| Jin Young Kim, Emine Yilmaz | ||
| Pages: 1129-1132 | ||
| doi>10.1145/2766462.2767875 | ||
|
||
|
This tutorial aims to provide attendees with a detailed understanding of an end-to-end evaluation pipeline based on human judgments (offline measurement). The tutorial will give an overview of the state-of-the-art methods, techniques, and metrics necessary ...
|
||
| Music Retrieval and Recommendation: A Tutorial Overview | ||
| Peter Knees, Markus Schedl | ||
| Pages: 1133-1136 | ||
| doi>10.1145/2766462.2767880 | ||
|
||
|
In this tutorial, we give an introduction to the field of and state of the art in music information retrieval (MIR). The tutorial particularly spotlights the question of music similarity, which is an essential aspect in music retrieval and recommendation. ...
|
||
| Exploiting Wikipedia for Information Retrieval Tasks | ||
| Bracha Shapira, Nir Ofek, Victor Makarenkov | ||
| Pages: 1137-1140 | ||
| doi>10.1145/2766462.2767879 | ||
|
||
|
Wikipedia - the online encyclopedia - has long been used as a source of information for researchers, as well as being a subject of research itself. Wikipedia has been shown to be effective in recommender systems, sentiment analysis, validation and multiple ...
|
||
| WORKSHOP SESSION: Workshops | ||
| Fernando Diaz, Diane Kelly | ||
|
||
|
We are pleased to introduce the Workshop Program for the 38th Annual SIGIR Conference. We received 14 workshop proposals, each of which was peer-reviewed by three members of the Workshops PC. After discussion of all submissions in the Workshops PC, as ...
|
||
| Web Question Answering: Beyond Factoids: SIGIR 2015 Workshop | ||
| Eugene Agichtein, David Carmel, Charles L.A. Clarke, Praveen Paritosh, Dan Pelleg, Idan Szpektor | ||
| Pages: 1143-1143 | ||
| doi>10.1145/2766462.2767861 | ||
|
||
| Graph Search and Beyond: SIGIR 2015 Workshop Summary | ||
| Omar Alonso, Marti A. Hearst, Jaap Kamps | ||
| Pages: 1145-1146 | ||
| doi>10.1145/2766462.2767855 | ||
|
||
|
Modern Web data is highly structured in terms of entities and relations from large knowledge resources, geo-temporal references and social network structure, resulting in a massive multidimensional graph. This graph essentially unifies both the searcher ...
|
||
| SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR) | ||
| Jaime Arguello, Fernando Diaz, Jimmy Lin, Andrew Trotman | ||
| Pages: 1147-1148 | ||
| doi>10.1145/2766462.2767858 | ||
|
||
| SIGIR 2015 Workshop on Temporal, Social and Spatially-aware Information Access (#TAIA2015) | ||
| Klaus Berberich, James Caverlee, Miles Efron, Claudia Hauff, Vanessa Murdock, Milad Shokouhi, Bart Thomee | ||
| Pages: 1149-1150 | ||
| doi>10.1145/2766462.2767860 | ||
|
||
|
In this workshop we aim to bring together practitioners and researchers to discuss their recent breakthroughs and the challenges with addressing spatial and temporal information access, both from the algorithmic and the architectural perspectives.
|
||
| NeuroIR 2015: Neuro-Physiological Methods in IR Research | ||
| Jacek Gwizdka, Joemon Jose, Javed Mostafa, Max Wilson | ||
| Pages: 1151-1153 | ||
| doi>10.1145/2766462.2767856 | ||
|
||
|
This Tutorial+Workshop will discuss opportunities and challenges involved in using neuro-physiological tools/techniques (such as fMRI, fNIRS, EEG, eye-tracking, GSR, HR, and facial expressions) and theories in information retrieval. The hybrid format ...
|
||
| SPS'15: 2015 International Workshop on Social Personalization & Search | ||
| Christoph Trattner, Denis Parra, Peter Brusilovsky, Leandro Marinho | ||
| Pages: 1155-1155 | ||
| doi>10.1145/2766462.2767859 | ||
|
||
| Privacy-Preserving IR 2015: When Information Retrieval Meets Privacy and Security | ||
| Hui Yang, Ian Soboroff | ||
| Pages: 1157-1158 | ||
| doi>10.1145/2766462.2767857 | ||
|
||
|
Information retrieval (IR) and information privacy/security are two fast-growing computer science disciplines. There are many synergies and connections between these two disciplines. However, there have been very limited efforts to connect the two important ...
|
||
It is my great pleasure to welcome you to Santiago de Chile, my hometown, and to the 38th International SIGIR Conference on Research and Development in Information Retrieval, the premier annual forum for presentations of research in information retrieval (IR) and related topics. This six-day event has a broad technical program that we hope you find interesting, useful, and insightful.
Amazingly, SIGIR 2015 is not only held in my hometown, but also where I did my high school studies. Could I ever imagine that? No way! So in this venue where I have plenty of mixed memories, we start the conference on Sunday with eight great tutorials, thanks to Yoelle Maarek; the Doctoral Consortium, organized by J. Shane Culpepper and Brian Davidson; and we finish the day with the opening reception in the central patio.
The next three days form the core of the conference, highlighted by the keynotes from Nick Belkin (Salton Award) on how we interact with information, and ChengXiang Zhai on game theory applied to IR. We also have 70 technical papers selected by the program committee chaired by Mounia Lalmas, Alistair Moffat and Berthier Ribeiro-Neto. Thanks for your great and hard work! We also have 79 short papers and 12 demos presented in a special session before the banquet on Tuesday, which will be held at a nearby palace on Cerro Santa Lucía. Thanks to Maarten de Rijke, Ee-Peng Lim and Ryen White for handling the short papers and Djoerd Hiemstra and Mirella Moro for arranging the demonstrations.
On Wednesday we also have the industry track, SIRIP, organized by Hang Li and Jaime Teevan. This track combines eight invited talks and four refereed papers. This day we have been able to keep the number of parallel tracks to just three, and also have the final panel on industry impact from academia, as a single track event. Finally, on Thursday we have seven interesting workshops, thanks to the committee chaired by Fernando Diaz and Diane Kelly.
Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval
| SESSION: Athena award lecture | ||
| Marti A. Hearst | ||
| Putting searchers into search | ||
| Susan T. Dumais | ||
| Pages: 1-2 | ||
| doi>10.1145/2600428.2617557 | ||
|
||
|
Over the last two decades the information retrieval landscape has changed dramatically. Twenty years ago, there were fewer than 3k web sites and the earliest web search engines indexed approximately 50k pages. Today, search engines index billions of ...
|
||
| SESSION: Session 1a: risks and rewards | ||
| Diane Kelly | ||
| Modelling interaction with economic models of search | ||
| Leif Azzopardi | ||
| Pages: 3-12 | ||
| doi>10.1145/2600428.2609574 | ||
|
||
|
Understanding how people interact when searching is central to the study of Interactive Information Retrieval (IIR). Most of the prior work has either been conceptual, observational or empirical. While this has led to numerous insights and findings regarding ...
|
||
| Query-performance prediction: setting the expectations straight | ||
| Fiana Raiber, Oren Kurland | ||
| Pages: 13-22 | ||
| doi>10.1145/2600428.2609581 | ||
|
||
|
The query-performance prediction task has been described as estimating retrieval effectiveness in the absence of relevance judgments. The expectations throughout the years were that improved prediction techniques would translate to improved retrieval ...
|
||
| Hypothesis testing for the risk-sensitive evaluation of retrieval systems | ||
| B. Taner Dinçer, Craig Macdonald, Iadh Ounis | ||
| Pages: 23-32 | ||
| doi>10.1145/2600428.2609625 | ||
|
||
|
The aim of risk-sensitive evaluation is to measure when a given information retrieval (IR) system does not perform worse than a corresponding baseline system for any topic. This paper argues that risk-sensitive evaluation is akin to the underlying methodology ...
|
||
| SESSION: Session 1b: #microblog #sigir2014 | ||
| Hang Li | ||
| Temporal feedback for tweet search with non-parametric density estimation | ||
| Miles Efron, Jimmy Lin, Jiyin He, Arjen de Vries | ||
| Pages: 33-42 | ||
| doi>10.1145/2600428.2609575 | ||
|
||
|
This paper investigates the temporal cluster hypothesis: in search tasks where time plays an important role, do relevant documents tend to cluster together in time? We explore this question in the context of tweet search and temporal feedback: starting ...
|
||
| Fine-grained location extraction from tweets with temporal awareness | ||
| Chenliang Li, Aixin Sun | ||
| Pages: 43-52 | ||
| doi>10.1145/2600428.2609582 | ||
|
||
|
Twitter is a popular platform for sharing activities, plans, and opinions. Through tweets, users often reveal their location information and short term visiting plans. In this paper, we are interested in extracting fine-grained locations mentioned in ...
|
||
| Collaborative personalized Twitter search with topic-language models | ||
| Jan Vosecky, Kenneth Wai-Ting Leung, Wilfred Ng | ||
| Pages: 53-62 | ||
| doi>10.1145/2600428.2609584 | ||
|
||
|
The vast amount of real-time and social content in microblogs results in an information overload for users when searching microblog data. Given the user's search query, delivering content that is relevant to her interests is a challenging problem. Traditional ...
|
||
| SESSION: Session 1c: recommendation | ||
| Jamie Callan | ||
| Gaussian process factorization machines for context-aware recommendations | ||
| Trung V. Nguyen, Alexandros Karatzoglou, Linas Baltrunas | ||
| Pages: 63-72 | ||
| doi>10.1145/2600428.2609623 | ||
|
||
|
Context-aware recommendation (CAR) can lead to significant improvements in the relevance of the recommended items by modeling the nuanced ways in which context influences preferences. The dominant approach in context-aware recommendation has been the ...
|
||
| Addressing cold start in recommender systems: a semi-supervised co-training algorithm | ||
| Mi Zhang, Jie Tang, Xuchen Zhang, Xiangyang Xue | ||
| Pages: 73-82 | ||
| doi>10.1145/2600428.2609599 | ||
|
||
|
Cold start is one of the most challenging problems in recommender systems. In this paper we tackle the cold-start problem by proposing a context-aware semi-supervised co-training method named CSEL. Specifically, we use a factorization model to capture ...
|
||
| Explicit factor models for explainable recommendation based on phrase-level sentiment analysis | ||
| Yongfeng Zhang, Guokun Lai, Min Zhang, Yi Zhang, Yiqun Liu, Shaoping Ma | ||
| Pages: 83-92 | ||
| doi>10.1145/2600428.2609579 | ||
|
||
|
Collaborative Filtering (CF)-based recommendation algorithms, such as Latent Factor Models (LFM), work well in terms of prediction accuracy. However, the latent features make it difficult to explain the recommendation results to the users. Fortunately, ...
|
||
| SESSION: Session 2a: (i can't get no) satisfaction | ||
| Justin Zobel | ||
| Context-aware web search abandonment prediction | ||
| Yang Song, Xiaolin Shi, Ryen White, Ahmed Hassan Awadallah | ||
| Pages: 93-102 | ||
| doi>10.1145/2600428.2609604 | ||
|
||
|
Web search queries without hyperlink clicks are often referred to as abandoned queries. Understanding the reasons for abandonment is crucial for search engines in evaluating their performance. Abandonment can be categorized as good or bad depending on ...
|
||
| Impact of response latency on user behavior in web search | ||
| Ioannis Arapakis, Xiao Bai, B. Barla Cambazoglu | ||
| Pages: 103-112 | ||
| doi>10.1145/2600428.2609627 | ||
|
||
|
Traditionally, the efficiency and effectiveness of search systems have both been of great interest to the information retrieval community. However, an in-depth analysis on the interplay between the response latency of web search systems and users' search ...
|
||
| Towards better measurement of attention and satisfaction in mobile search | ||
| Dmitry Lagun, Chih-Hung Hsieh, Dale Webster, Vidhya Navalpakkam | ||
| Pages: 113-122 | ||
| doi>10.1145/2600428.2609631 | ||
|
||
|
Web Search has seen two big changes recently: rapid growth in mobile search traffic, and an increasing trend towards providing answer-like results for relatively simple information needs (e.g., [weather today]). Such results display the answer or relevant ...
|
||
| Modeling action-level satisfaction for search task satisfaction prediction | ||
| Hongning Wang, Yang Song, Ming-Wei Chang, Xiaodong He, Ahmed Hassan, Ryen W. White | ||
| Pages: 123-132 | ||
| doi>10.1145/2600428.2609607 | ||
|
||
|
Search satisfaction is a property of a user's search process. Understanding it is critical for search providers to evaluate the performance and improve the effectiveness of search engines. Existing methods model search satisfaction holistically at the ...
|
||
| SESSION: Session 2b: doctors and lawyers | ||
| Leif Azzopardi | ||
| Circumlocution in diagnostic medical queries | ||
| Isabelle Stanton, Samuel Ieong, Nina Mishra | ||
| Pages: 133-142 | ||
| doi>10.1145/2600428.2609589 | ||
|
||
|
Circumlocution is when many words are used to describe what could be said with fewer, e.g., "a machine that takes moisture out of the air" instead of "dehumidifier." Web search is a perfect backdrop for circumlocution where people struggle to name what ...
|
||
| Interactions between health searchers and search engines | ||
| Georg P. Schoenherr, Ryen W. White | ||
| Pages: 143-152 | ||
| doi>10.1145/2600428.2609602 | ||
|
||
|
The Web is an important resource for understanding and diagnosing medical conditions. Based on exposure to online content, people may develop undue health concerns, believing that common and benign symptoms are explained by serious illnesses. In ...
|
||
| Evaluation of machine-learning protocols for technology-assisted review in electronic discovery | ||
| Gordon V. Cormack, Maura R. Grossman | ||
| Pages: 153-162 | ||
| doi>10.1145/2600428.2609601 | ||
|
||
|
Abstract Using a novel evaluation toolkit that simulates a human reviewer in the loop, we compare the effectiveness of three machine-learning protocols for technology-assisted review as used in document review for discovery in legal proceedings. Our ...
|
||
| ReQ-ReC: high recall retrieval with query pooling and interactive classification | ||
| Cheng Li, Yue Wang, Paul Resnick, Qiaozhu Mei | ||
| Pages: 163-172 | ||
| doi>10.1145/2600428.2609618 | ||
|
We consider a scenario where a searcher requires both high precision and high recall from an interactive retrieval process. Such scenarios are very common in real life, exemplified by medical search, legal search, market research, and literature review. ...
|
||
| SESSION: Session 2c: hashing and efficiency | ||
| Dawei Song | ||
| Supervised hashing with latent factor models | ||
| Peichao Zhang, Wei Zhang, Wu-Jun Li, Minyi Guo | ||
| Pages: 173-182 | ||
| doi>10.1145/2600428.2609600 | ||
|
Due to its low storage cost and fast query speed, hashing has been widely adopted for approximate nearest neighbor search in large-scale datasets. Traditional hashing methods try to learn the hash codes in an unsupervised way where the metric (Euclidean) ...
|
||
| Preference preserving hashing for efficient recommendation | ||
| Zhiwei Zhang, Qifan Wang, Lingyun Ruan, Luo Si | ||
| Pages: 183-192 | ||
| doi>10.1145/2600428.2609578 | ||
|
Recommender systems usually need to compare a large number of items before users' most preferred ones can be found. This process can be very costly if recommendations are frequently made on large scale datasets. In this paper, a novel hashing algorithm, ...
|
||
| Load balancing for partition-based similarity search | ||
| Xun Tang, Maha Alabduljalil, Xin Jin, Tao Yang | ||
| Pages: 193-202 | ||
| doi>10.1145/2600428.2609624 | ||
|
All pairs similarity search, used in many data mining and information retrieval applications, is a time consuming process. Although a partition-based approach accelerates this process by simplifying parallelism management and avoiding unnecessary I/O ...
|
||
| Estimating global statistics for unstructured P2P search in the presence of adversarial peers | ||
| Sami Richardson, Ingemar J. Cox | ||
| Pages: 203-212 | ||
| doi>10.1145/2600428.2609567 | ||
|
A common problem in unstructured peer-to-peer (P2P) information retrieval is the need to compute global statistics of the full collection, when only a small subset of the collection is visible to a peer. Without accurate estimates of these statistics, ...
|
||
| SESSION: Session 3a: Social media | ||
| Hui Fang | ||
| Hierarchical multi-label classification of social text streams | ||
| Zhaochun Ren, Maria-Hendrike Peetz, Shangsong Liang, Willemijn van Dolen, Maarten de Rijke | ||
| Pages: 213-222 | ||
| doi>10.1145/2600428.2609595 | ||
|
Hierarchical multi-label classification assigns a document to multiple hierarchical classes. In this paper we focus on hierarchical multi-label classification of social text streams. Concept drift, complicated relations among classes, and the limited ...
|
||
| An adaptive teleportation random walk model for learning social tag relevance | ||
| Xiaofei Zhu, Wolfgang Nejdl, Mihai Georgescu | ||
| Pages: 223-232 | ||
| doi>10.1145/2600428.2609556 | ||
|
Social tags are known to be a valuable source of information for image retrieval and organization. However, contrary to the conventional document retrieval, rich tag frequency information in social sharing systems, such as Flickr, is not available, thus ...
|
||
| Predicting the popularity of web 2.0 items based on user comments | ||
| Xiangnan He, Ming Gao, Min-Yen Kan, Yiqun Liu, Kazunari Sugiyama | ||
| Pages: 233-242 | ||
| doi>10.1145/2600428.2609558 | ||
|
In the current Web 2.0 era, the popularity of Web resources fluctuates ephemerally, based on trends and social interest. As a result, content-based relevance signals are insufficient to meet users' constantly evolving information needs in searching for ...
|
||
| Recommending social media content to community owners | ||
| Inbal Ronen, Ido Guy, Elad Kravi, Maya Barnea | ||
| Pages: 243-252 | ||
| doi>10.1145/2600428.2609596 | ||
|
Online communities within the enterprise offer their leaders an easy and accessible way to attract, engage, and influence others. Our research studies the recommendation of social media content to leaders (owners) of online communities within the enterprise. ...
|
||
| SESSION: Session 3b: indexing and efficiency | ||
| Alistair Moffat | ||
| Predictive parallelization: taming tail latencies in web search | ||
| Myeongjae Jeon, Saehoon Kim, Seung-won Hwang, Yuxiong He, Sameh Elnikety, Alan L. Cox, Scott Rixner | ||
| Pages: 253-262 | ||
| doi>10.1145/2600428.2609572 | ||
|
Web search engines are optimized to reduce the high-percentile response time to consistently provide fast responses to almost all user queries. This is a challenging task because the query workload exhibits large variability, consisting of many short-running ...
|
||
| Skewed partial bitvectors for list intersection | ||
| Andrew Kane, Frank Wm. Tompa | ||
| Pages: 263-272 | ||
| doi>10.1145/2600428.2609609 | ||
|
This paper examines the space-time performance of in-memory conjunctive list intersection algorithms, as used in search engines, where integers represent document identifiers. We demonstrate that the combination of bitvectors, large skips, delta compressed ...
|
||
| Partitioned Elias-Fano indexes | ||
| Giuseppe Ottaviano, Rossano Venturini | ||
| Pages: 273-282 | ||
| doi>10.1145/2600428.2609615 | ||
|
The Elias-Fano representation of monotone sequences has been recently applied to the compression of inverted indexes, showing excellent query performance thanks to its efficient random access and search operations. While its space occupancy is ...
|
||
| Principled dictionary pruning for low-memory corpus compression | ||
| Jiancong Tong, Anthony Wirth, Justin Zobel | ||
| Pages: 283-292 | ||
| doi>10.1145/2600428.2609576 | ||
|
Compression of collections, such as text databases, can both reduce space consumption and increase retrieval efficiency, through better caching and better exploitation of the memory hierarchy. A promising technique is relative Lempel-Ziv coding, in which ...
|
||
| SESSION: Session 3c: e pluribus unum | ||
| Bruce Croft | ||
| Learning for search result diversification | ||
| Yadong Zhu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng, Shuzi Niu | ||
| Pages: 293-302 | ||
| doi>10.1145/2600428.2609634 | ||
|
Search result diversification has gained attention as a way to tackle the ambiguous or multi-faceted information needs of users. Most existing methods on this problem utilize a heuristic predefined ranking function, where limited features can be incorporated ...
|
||
| Fusion helps diversification | ||
| Shangsong Liang, Zhaochun Ren, Maarten de Rijke | ||
| Pages: 303-312 | ||
| doi>10.1145/2600428.2609561 | ||
|
A popular strategy for search result diversification is to first retrieve a set of documents utilizing a standard retrieval method and then rerank the results. We adopt a different perspective on the problem, based on data fusion. Starting from the hypothesis ...
|
||
| Utilizing relevance feedback in fusion-based retrieval | ||
| Ella Rabinovich, Ofri Rom, Oren Kurland | ||
| Pages: 313-322 | ||
| doi>10.1145/2600428.2609573 | ||
|
Work on using relevance feedback for retrieval has focused on the single retrieved list setting. That is, an initial document list is retrieved in response to the query and feedback for the most highly ranked documents is used to perform a second search. ...
|
||
| A simple term frequency transformation model for effective pseudo relevance feedback | ||
| Zheng Ye, Jimmy Xiangji Huang | ||
| Pages: 323-332 | ||
| doi>10.1145/2600428.2609636 | ||
|
Pseudo Relevance Feedback is an effective technique to improve the performance of ad-hoc information retrieval. Traditionally, the expansion terms are extracted either according to the term distributions in the feedback documents; or according to both ...
|
||
| SESSION: Plenary address | ||
| Shlomo Geva | ||
| Seeking simplicity in search user interfaces | ||
| Marti A. Hearst | ||
| Pages: 333-334 | ||
| doi>10.1145/2600428.2617558 | ||
|
It is rare for a new user interface to break through and become successful, especially in information-intensive tasks like search, coming to consensus or building up knowledge. Most complex interfaces end up going unused. Often the successful solution ...
|
||
| SESSION: Session 4a: think globally, act locally | ||
| Matt Lease | ||
| Who is the barbecue king of Texas?: a geo-spatial approach to finding local experts on Twitter | ||
| Zhiyuan Cheng, James Caverlee, Himanshu Barthwal, Vandana Bachani | ||
| Pages: 335-344 | ||
| doi>10.1145/2600428.2609580 | ||
|
This paper addresses the problem of identifying local experts in social media systems like Twitter. Local experts -- in contrast to general topic experts -- have specialized knowledge focused around a particular location, and are important for many applications ...
|
||
| Your neighbors affect your ratings: on geographical neighborhood influence to rating prediction | ||
| Longke Hu, Aixin Sun, Yong Liu | ||
| Pages: 345-354 | ||
| doi>10.1145/2600428.2609593 | ||
|
Rating prediction is to predict the preference rating of a user to an item that she has not rated before. Using the business review data from Yelp, in this paper, we study business rating prediction. A business here can be a restaurant, a shopping mall ...
|
||
| Processing spatial keyword query as a top-k aggregation query | ||
| Dongxiang Zhang, Chee-Yong Chan, Kian-Lee Tan | ||
| Pages: 355-364 | ||
| doi>10.1145/2600428.2609562 | ||
|
We examine the spatial keyword search problem to retrieve objects of interest that are ranked based on both their spatial proximity to the query location as well as the textual relevance of the object's keywords. Existing solutions for the problem are ...
|
||
| SESSION: Session 4b: scientia potentia est | ||
| Isabelle Moulinier | ||
| Entity query feature expansion using knowledge base links | ||
| Jeffrey Dalton, Laura Dietz, James Allan | ||
| Pages: 365-374 | ||
| doi>10.1145/2600428.2609628 | ||
|
Recent advances in automatic entity linking and knowledge base construction have resulted in entity annotations for document and query collections. For example, annotations of entities from large general purpose knowledge bases, such as Freebase and ...
|
||
| QUADS: question answering for decision support | ||
| Zi Yang, Ying Li, James Cai, Eric Nyberg | ||
| Pages: 375-384 | ||
| doi>10.1145/2600428.2609606 | ||
|
As the scale of available on-line data grows ever larger, individuals and businesses must cope with increasing complexity in decision-making processes which utilize large volumes of unstructured, semi-structured and/or structured data to satisfy multiple, ...
|
||
| Topic labeled text classification: a weakly supervised approach | ||
| Swapnil Hingmire, Sutanu Chakraborti | ||
| Pages: 385-394 | ||
| doi>10.1145/2600428.2609565 | ||
|
Supervised text classifiers require extensive human expertise and labeling efforts. In this paper, we propose a weakly supervised text classification algorithm based on the labeling of Latent Dirichlet Allocation (LDA) topics. Our algorithm is based ...
|
||
| SESSION: Session 4c: more hashing | ||
| Mark Sanderson | ||
| Discriminative coupled dictionary hashing for fast cross-media retrieval | ||
| Zhou Yu, Fei Wu, Yi Yang, Qi Tian, Jiebo Luo, Yueting Zhuang | ||
| Pages: 395-404 | ||
| doi>10.1145/2600428.2609563 | ||
|
Cross-media hashing, which conducts cross-media retrieval by embedding data from different modalities into a common low-dimensional Hamming space, has attracted intensive attention in recent years. The existing cross-media hashing approaches only aim ...
|
||
| Active hashing with joint data example and tag selection | ||
| Qifan Wang, Luo Si, Zhiwei Zhang, Ning Zhang | ||
| Pages: 405-414 | ||
| doi>10.1145/2600428.2609590 | ||
|
Similarity search is an important problem in many large scale applications such as image and text retrieval. Hashing method has become popular for similarity search due to its fast search speed and low storage cost. Recent research has shown that hashing ...
|
||
| Latent semantic sparse hashing for cross-modal similarity search | ||
| Jile Zhou, Guiguang Ding, Yuchen Guo | ||
| Pages: 415-424 | ||
| doi>10.1145/2600428.2609610 | ||
|
Similarity search methods based on hashing for effective and efficient cross-modal retrieval on large-scale multimedia databases with massive text and images have attracted considerable attention. The core problem of cross-modal hashing is how to effectively ...
|
||
| SESSION: Session 5a: brains!!! | ||
| Mark Smucker | ||
| Predicting term-relevance from brain signals | ||
| Manuel J.A. Eugster, Tuukka Ruotsalo, Michiel M. Spapé, Ilkka Kosunen, Oswald Barral, Niklas Ravaja, Giulio Jacucci, Samuel Kaski | ||
| Pages: 425-434 | ||
| doi>10.1145/2600428.2609594 | ||
|
Term-Relevance Prediction from Brain Signals (TRPB) is proposed to automatically detect relevance of text information directly from brain signals. An experiment with forty participants was conducted to record neural activity of participants while providing ...
|
||
| Multidimensional relevance modeling via psychometrics and crowdsourcing | ||
| Yinglong Zhang, Jin Zhang, Matthew Lease, Jacek Gwizdka | ||
| Pages: 435-444 | ||
| doi>10.1145/2600428.2609577 | ||
|
While many multidimensional models of relevance have been posited, prior studies have been largely exploratory rather than confirmatory. Lacking a methodological framework to quantify the relationships among factors or measure model fit to observed data, ...
|
||
| SESSION: Session 5b0: auto-completion | ||
| Jimmy Lin | ||
| Learning user reformulation behavior for query auto-completion | ||
| Jyun-Yu Jiang, Yen-Yu Ke, Pao-Yu Chien, Pu-Jen Cheng | ||
| Pages: 445-454 | ||
| doi>10.1145/2600428.2609614 | ||
|
It is crucial for query auto-completion to accurately predict what a user is typing. Given a query prefix and its context (e.g., previous queries), conventional context-aware approaches often produce relevant queries to the context. The purpose of this ...
|
||
| A two-dimensional click model for query auto-completion | ||
| Yanen Li, Anlei Dong, Hongning Wang, Hongbo Deng, Yi Chang, ChengXiang Zhai | ||
| Pages: 455-464 | ||
| doi>10.1145/2600428.2609571 | ||
|
Query auto-completion (QAC) facilitates faster user query input by predicting users' intended queries. Most QAC algorithms take a learning-based approach to incorporate various signals for query relevance prediction. However, such models are trained ...
|
||
| SESSION: Session 5b1: how to win friends and influence people | ||
| Jimmy Lin | ||
| On measuring social friend interest similarities in recommender systems | ||
| Hao Ma | ||
| Pages: 465-474 | ||
| doi>10.1145/2600428.2609635 | ||
|
Social recommender system has become an emerging research topic due to the prevalence of online social networking services during the past few years. In this paper, aiming at providing fundamental support to the research of social recommendation problem, ...
|
||
| IMRank: influence maximization via finding self-consistent ranking | ||
| Suqi Cheng, Huawei Shen, Junming Huang, Wei Chen, Xueqi Cheng | ||
| Pages: 475-484 | ||
| doi>10.1145/2600428.2609592 | ||
|
Influence maximization, fundamental for word-of-mouth marketing and viral marketing, aims to find a set of seed nodes maximizing influence spread on social network. Early methods mainly fall into two paradigms with certain benefits and drawbacks: (1) ...
|
||
| SESSION: Session 5c: collaborative complex personalization | ||
| Jimmy Huang | ||
| User-driven system-mediated collaborative information retrieval | ||
| Laure Soulier, Chirag Shah, Lynda Tamine | ||
| Pages: 485-494 | ||
| doi>10.1145/2600428.2609598 | ||
|
Most of the previous approaches surrounding collaborative information retrieval (CIR) provide either a user-based mediation, in which the system only supports users' collaborative activities, or a system-based mediation, in which the system plays an ...
|
||
| SearchPanel: framing complex search needs | ||
| Pernilla Qvarfordt, Simon Tretter, Gene Golovchinsky, Tony Dunnigan | ||
| Pages: 495-504 | ||
| doi>10.1145/2600428.2609620 | ||
|
People often use more than one query when searching for information. They revisit search results to re-find information and build an understanding of their search need through iterative explorations of query formulation. These tasks are not well-supported ...
|
||
| Cohort modeling for enhanced personalized search | ||
| Jinyun Yan, Wei Chu, Ryen W. White | ||
| Pages: 505-514 | ||
| doi>10.1145/2600428.2609617 | ||
|
Web search engines utilize behavioral signals to develop search experiences tailored to individual users. To be effective, such personalization relies on access to sufficient information about each user's interests and intentions. For new users or new ...
|
||
| Characterizing multi-click search behavior and the risks and opportunities of changing results during use | ||
| Chia-Jung Lee, Jaime Teevan, Sebastian de la Chica | ||
| Pages: 515-524 | ||
| doi>10.1145/2600428.2609588 | ||
|
Although searchers often click on more than one result following a query, little is known about how they interact with search results after their first click. Using large scale query log analysis, we characterize what people do when they return to a ...
|
||
| SESSION: Plenary address | ||
| Andrew Trotman | ||
| The data revolution: how companies are transforming with big data | ||
| Hugh E. Williams | ||
| Pages: 525-526 | ||
| doi>10.1145/2600428.2617559 | ||
|
Spelling correction in the 1990s was all about algorithms and small dictionaries. This century, it is about mining vast data sets of past user behaviors, simple algorithms, and using those to correct mistakes. The large Internet giants are data-driven ...
|
||
| SESSION: Session 6a: #moremicroblog #sigir2014 | ||
| ChengXiang Zhai | ||
| Learning similarity functions for topic detection in online reputation monitoring | ||
| Damiano Spina, Julio Gonzalo, Enrique Amigó | ||
| Pages: 527-536 | ||
| doi>10.1145/2600428.2609621 | ||
|
Reputation management experts have to monitor, among others, Twitter constantly and decide, at any given time, what is being said about the entity of interest (a company, organization, personality...). Solving this reputation monitoring problem automatically ...
|
||
| Predicting trending messages and diffusion participants in microblogging network | ||
| Jingwen Bian, Yang Yang, Tat-Seng Chua | ||
| Pages: 537-546 | ||
| doi>10.1145/2600428.2609616 | ||
|
Microblogging services have emerged as an essential way to strengthen the communications among individuals. One of the most important features of microblog over traditional social networks is the extensive proliferation in information diffusion. As the ...
|
||
| Leveraging knowledge across media for spammer detection in microblogging | ||
| Xia Hu, Jiliang Tang, Huan Liu | ||
| Pages: 547-556 | ||
| doi>10.1145/2600428.2609632 | ||
|
While microblogging has emerged as an important information sharing and communication platform, it has also become a convenient venue for spammers to overwhelm other users with unwanted content. Currently, spammer detection in microblogging focuses on ...
|
||
| SESSION: Session 6b: scents and sensibility | ||
| Doug Oard | ||
| Using information scent and need for cognition to understand online search behavior | ||
| Wan-Ching Wu, Diane Kelly, Avneesh Sud | ||
| Pages: 557-566 | ||
| doi>10.1145/2600428.2609626 | ||
|
The purpose of this study is to investigate the extent to which two theories, Information Scent and Need for Cognition, explain people's search behaviors when interacting with search engine results pages (SERPs). Information Scent, the perception of ...
|
||
| Discrimination between tasks with user activity patterns during information search | ||
| Michael J. Cole, Chathra Hendahewa, Nicholas J. Belkin, Chirag Shah | ||
| Pages: 567-576 | ||
| doi>10.1145/2600428.2609591 | ||
|
Can the activity patterns of page use during information search sessions discriminate between different types of information seeking tasks? We model sequences of interactions with search result and content pages during information search sessions. Two ...
|
||
| Investigating users' query formulations for cognitive search intents | ||
| Makoto P. Kato, Takehiro Yamamoto, Hiroaki Ohshima, Katsumi Tanaka | ||
| Pages: 577-586 | ||
| doi>10.1145/2600428.2609566 | ||
|
This study investigated query formulations by users with Cognitive Search Intents (CSIs), which are users' needs for the cognitive characteristics of documents to be retrieved, e.g., comprehensibility, subjectivity, and concreteness. Our four ...
|
||
| SESSION: Session 6c: users vs. models | ||
| Ricardo Baeza-Yates | ||
| Win-win search: dual-agent stochastic game in session search | ||
| Jiyun Luo, Sicong Zhang, Hui Yang | ||
| Pages: 587-596 | ||
| doi>10.1145/2600428.2609629 | ||
|
Session search is a complex search task that involves multiple search iterations triggered by query reformulations. We observe a Markov chain in session search: user's judgment of retrieved documents in the previous search iteration affects user's actions ...
|
||
| Injecting user models and time into precision via Markov chains | ||
| Marco Ferrante, Nicola Ferro, Maria Maistro | ||
| Pages: 597-606 | ||
| doi>10.1145/2600428.2609637 | ||
|
We propose a family of new evaluation measures, called Markov Precision (MP), which exploits continuous-time and discrete-time Markov chains in order to inject user models into precision. Continuous-time MP behaves like time-calibrated measures, bringing ...
|
||
| Searching, browsing, and clicking in a search session: changes in user behavior by task and over time | ||
| Jiepu Jiang, Daqing He, James Allan | ||
| Pages: 607-616 | ||
| doi>10.1145/2600428.2609633 | ||
|
There are many existing studies of user behavior in simple tasks (e.g., navigational and informational search) within a short duration of 1-2 queries. However, we know relatively little about user behavior, especially browsing and clicking behavior, ...
|
||
| SESSION: Session 7a: sentiments | ||
| Kevyn Collins-Thompson | ||
| Coarse-to-fine review selection via supervised joint aspect and sentiment model | ||
| Zhen Hai, Gao Cong, Kuiyu Chang, Wenting Liu, Peng Cheng | ||
| Pages: 617-626 | ||
| doi>10.1145/2600428.2609570 | ||
|
Online reviews are immensely valuable for customers to make informed purchase decisions and for businesses to improve the quality of their products and services. However, customer reviews grow exponentially while varying greatly in quality. It is generally ...
|
||
| Cross-domain and cross-category emotion tagging for comments of online news | ||
| Ying Zhang, Ning Zhang, Luo Si, Yanshan Lu, Qifan Wang, Xiaojie Yuan | ||
| Pages: 627-636 | ||
| doi>10.1145/2600428.2609587 | ||
|
In many online news services, users often write comments towards news in subjective emotions such as sadness, happiness or anger. Knowing such emotions can help understand the preferences and perspectives of individual users, and therefore may facilitate ...
|
||
| Economically-efficient sentiment stream analysis | ||
| Roberto Lourenco Jr., Adriano Veloso, Adriano Pereira, Wagner Meira Jr., Renato Ferreira, Srinivasan Parthasarathy | ||
| Pages: 637-646 | ||
| doi>10.1145/2600428.2609612 | ||
|
Text-based social media channels, such as Twitter, produce torrents of opinionated data about the most diverse topics and entities. The analysis of such data (a.k.a. sentiment analysis) is quickly becoming a key feature in recommender systems and search ...
|
||
| SESSION: Session 7b: more like those | ||
| Yi Zhang | ||
| New and improved: modeling versions to improve app recommendation | ||
| Jovian Lin, Kazunari Sugiyama, Min-Yen Kan, Tat-Seng Chua | ||
| Pages: 647-656 | ||
| doi>10.1145/2600428.2609560 | ||
|
Existing recommender systems usually model items as static -- unchanging in attributes, description, and features. However, in domains such as mobile apps, a version update may provide substantial changes to an app as updates, reflected by an increment ...
|
||
| Bundle recommendation in ecommerce | ||
| Tao Zhu, Patrick Harrington, Junjun Li, Lei Tang | ||
| Pages: 657-666 | ||
| doi>10.1145/2600428.2609603 | ||
|
Recommender system has become an important component in modern eCommerce. Recent research on recommender systems has been mainly concentrating on improving the relevance or profitability of individual recommended items. But in reality, users are usually ...
|
||
| Does product recommendation meet its Waterloo in unexplored categories?: no, price comes to help | ||
| Jia Chen, Qin Jin, Shiwan Zhao, Shenghua Bao, Li Zhang, Zhong Su, Yong Yu | ||
| Pages: 667-676 | ||
| doi>10.1145/2600428.2609608 | ||
|
State-of-the-art methods for product recommendation encounter significant performance drop in categories where a user has no purchase history. This problem needs to be addressed since current online retailers are moving beyond single category and attempting ...
|
||
| SESSION: Session 7c: signs and symbols | ||
| Jaap Kamps | ||
| Query expansion for mixed-script information retrieval | ||
| Parth Gupta, Kalika Bali, Rafael E. Banchs, Monojit Choudhury, Paolo Rosso | ||
| Pages: 677-686 | ||
| doi>10.1145/2600428.2609622 | ||
|
For many languages that use non-Roman based indigenous scripts (e.g., Arabic, Greek and Indic languages) one can often find a large amount of user generated transliterated content on the Web in the Roman script. Such content creates a monolingual or ...
|
||
| Retrieval of similar chess positions | ||
| Debasis Ganguly, Johannes Leveling, Gareth J.F. Jones | ||
| Pages: 687-696 | ||
| doi>10.1145/2600428.2609605 | ||
|
We address the problem of retrieving chess game positions similar to a given query position from a collection of archived chess games. We investigate this problem from an information retrieval (IR) perspective. The advantage of our proposed IR-based ...
|
||
| A mathematics retrieval system for formulae in layout presentations | ||
| Xiaoyan Lin, Liangcai Gao, Xuan Hu, Zhi Tang, Yingnan Xiao, Xiaozhong Liu | ||
| Pages: 697-706 | ||
| doi>10.1145/2600428.2609611 | ||
|
The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text index and retrieval methods. This paper proposes an innovative ...
|
||
| SESSION: Session 8a: picture this | ||
| Grace Hui Yang | ||
| The knowing camera 2: recognizing and annotating places-of-interest in smartphone photos | ||
| Pai Peng, Lidan Shou, Ke Chen, Gang Chen, Sai Wu | ||
| Pages: 707-716 | ||
| doi>10.1145/2600428.2609557 | ||
|
This paper presents a project called Knowing Camera for real-time recognizing and annotating places-of-interest (POI) in smartphone photos, with the availability of online geotagged images of such places. We propose a "Spatial+Visual" (S+V) framework ...
|
||
| Click-through-based cross-view learning for image search | ||
| Yingwei Pan, Ting Yao, Tao Mei, Houqiang Li, Chong-Wah Ngo, Yong Rui | ||
| Pages: 717-726 | ||
| doi>10.1145/2600428.2609568 | ||
|
One of the fundamental problems in image search is to rank image documents according to a given textual query. Existing search engines highly depend on surrounding texts for ranking images, or leverage the query-image pairs annotated by human labelers ...
|
||
| Learning to personalize trending image search suggestion | ||
| Chun-Che Wu, Tao Mei, Winston H. Hsu, Yong Rui | ||
| Pages: 727-736 | ||
| doi>10.1145/2600428.2609569 | ||
|
Trending search suggestion is leading a new paradigm of image search, where user's exploratory search experience is facilitated with the automatic suggestion of trending queries. Existing image search engines, however, only provide general suggestions ...
|
||
| PRISM: concept-preserving social image search results summarization | ||
| Boon-Siew Seah, Sourav S. Bhowmick, Aixin Sun | ||
| Pages: 737-746 | ||
| doi>10.1145/2600428.2609586 | ||
|
Most existing tag-based social image search engines present search results as a ranked list of images, which cannot be consumed by users in a natural and intuitive manner. In this paper, we present a novel concept-preserving image search results summarization ...
|
||
| SESSION: Session 8b: time and tide | ||
| Oren Kurland | ||
| Time-critical search | ||
| Nina Mishra, Ryen W. White, Samuel Ieong, Eric Horvitz | ||
| Pages: 747-756 | ||
| doi>10.1145/2600428.2609613 | ||
|
We study time-critical search, where users have urgent information needs in the context of an acute problem. As examples, users may need to know how to stem a severe bleed, help a baby who is choking on a foreign object, or respond to an epileptic seizure. ...
|
||
| Learning temporal-dependent ranking models | ||
| Miguel Costa, Francisco Couto, Mário Silva | ||
| Pages: 757-766 | ||
| doi>10.1145/2600428.2609619 | ||
|
Web archives already hold together more than 534 billion files and this number continues to grow as new initiatives arise. Searching on all versions of these files acquired throughout time is challenging, since users expect as fast and precise answers ...
|
||
| Web page segmentation with structured prediction and its application in web page classification | ||
| Lidong Bing, Rui Guo, Wai Lam, Zheng-Yu Niu, Haifeng Wang | ||
| Pages: 767-776 | ||
| doi>10.1145/2600428.2609630 | ||
|
We propose a framework which can perform Web page segmentation with a structured prediction approach. It formulates the segmentation task as a structured labeling problem on a transformed Web page segmentation graph (WPS-graph). WPS-graph models the ...
|
||
| Query log driven web search results clustering | ||
| Jose G. Moreno, Gaël Dias, Guillaume Cleuziou | ||
| Pages: 777-786 | ||
| doi>10.1145/2600428.2609583 | ||
|
Different important studies in Web search results clustering have recently shown increasing performances motivated by the use of external resources. Following this trend, we present a new algorithm called Dual C-Means, which provides a theoretical background ...
|
||
| SESSION: Session 8c0: summaries and semantics | ||
| Paul Bennett | ||
| CTSUM: extracting more certain summaries for news articles | ||
| Xiaojun Wan, Jianmin Zhang | ||
| Pages: 787-796 | ||
| doi>10.1145/2600428.2609559 | ||
|
Full text: |
||
|
People often read summaries of news articles in order to get reliable information about an event or a topic. However, the information expressed in news articles is not always certain, and some sentences contain uncertain information about the event. ...
|
||
| Continuous word embeddings for detecting local text reuses at the semantic level | ||
| Qi Zhang, Jihua Kang, Jin Qian, Xuanjing Huang | ||
| Pages: 797-806 | ||
| doi>10.1145/2600428.2609597 | ||
|
Full text: |
||
|
Text reuse is a common phenomenon in a variety of user-generated content. Along with the quick expansion of social media, reuses of local text are occurring much more frequently than ever before. The task of detecting these local reuses serves as an ...
|
||
| SESSION: Session 8C1: [citation] recommendation | ||
| Paul Bennett | ||
| CiteSight: supporting contextual citation recommendation using differential search | ||
| Avishay Livne, Vivek Gokuladas, Jaime Teevan, Susan T. Dumais, Eytan Adar | ||
| Pages: 807-816 | ||
| doi>10.1145/2600428.2609585 | ||
|
Full text: |
||
|
A person often uses a single search engine for very different tasks. For example, an author editing a manuscript may use the same academic search engine to find the latest work on a particular topic or to find the correct citation for a familiar article. ...
|
||
| Cross-language context-aware citation recommendation in scientific articles | ||
| Xuewei Tang, Xiaojun Wan, Xun Zhang | ||
| Pages: 817-826 | ||
| doi>10.1145/2600428.2609564 | ||
|
Full text: |
||
|
Adequacy of citations is very important for a scientific paper. However, it is not an easy job to find appropriate citations for a given context, especially for citations in different languages. In this paper, we define a novel task of cross-language ...
|
||
| POSTER SESSION: Poster session (short papers) | ||
| Search result diversification via data fusion | ||
| Shengli Wu, Chunlan Huang | ||
| Pages: 827-830 | ||
| doi>10.1145/2600428.2609451 | ||
|
Full text: |
||
|
In recent years, researchers have investigated search result diversification through a variety of approaches. In such situations, information retrieval systems need to consider both aspects of relevance and diversity for those retrieved documents. On ...
|
||
| Hashtag recommendation for hyperlinked tweets | ||
| Surendra Sedhai, Aixin Sun | ||
| Pages: 831-834 | ||
| doi>10.1145/2600428.2609452 | ||
|
Full text: |
||
|
The presence of a hyperlink in a tweet is a strong indication that the tweet is more informative. In this paper, we study the problem of hashtag recommendation for hyperlinked tweets (i.e., tweets containing links to Web pages). By recommending hashtags to hyperlinked ...
|
||
| Personalized document re-ranking based on Bayesian probabilistic matrix factorization | ||
| Fei Cai, Shangsong Liang, Maarten de Rijke | ||
| Pages: 835-838 | ||
| doi>10.1145/2600428.2609453 | ||
|
Full text: |
||
|
A query considered in isolation provides limited information about the searcher's interest. Previous work has considered various types of user behavior, e.g., clicks and dwell time, to obtain a better understanding of the user's intent. We consider the ...
|
||
| Using the cross-entropy method to re-rank search results | ||
| Haggai Roitman, Shay Hummel, Oren Kurland | ||
| Pages: 839-842 | ||
| doi>10.1145/2600428.2609454 | ||
|
Full text: |
||
|
We present a novel unsupervised approach to re-ranking an initially retrieved list. The approach is based on the Cross Entropy method applied to permutations of the list, and relies on performance prediction. Using pseudo predictors we establish a lower ...
|
||
| Computing and applying topic-level user interactions in microblog recommendation | ||
| Xiao Lu, Peng Li, Hongyuan Ma, Shuxin Wang, Anying Xu, Bin Wang | ||
| Pages: 843-846 | ||
| doi>10.1145/2600428.2609455 | ||
|
Full text: |
||
|
With the development of microblog services, tens of thousands of messages are produced every day and recommending useful messages according to users' interest is recognized as an effective way to overcome the information overload problem. Collaborative ...
|
||
| Towards context-aware search with right click | ||
| Aixin Sun, Chii-Hian Lou | ||
| Pages: 847-850 | ||
| doi>10.1145/2600428.2609456 | ||
|
Full text: |
||
|
Many queries are submitted to search engines by right-clicking the marked text (i.e., the query) in Web browsers. Because the document being read by the searcher often provides sufficient contextual information for the query, a search engine could provide ...
|
||
| Rendering expressions to improve accuracy of relevance assessment for math search | ||
| Matthias S. Reichenbach, Anurag Agarwal, Richard Zanibbi | ||
| Pages: 851-854 | ||
| doi>10.1145/2600428.2609457 | ||
|
Full text: |
||
|
Finding ways to help users assess relevance when they search using math expressions is critical for making Mathematical Information Retrieval (MIR) systems easier to use. We designed a study where participants completed search tasks involving mathematical ...
|
||
| Exploring recommendations in internet of things | ||
| Lina Yao, Quan Z. Sheng, Anne H.H. Ngu, Helen Ashman, Xue Li | ||
| Pages: 855-858 | ||
| doi>10.1145/2600428.2609458 | ||
|
Full text: |
||
|
With recent advances in radio-frequency identification (RFID), wireless sensor networks, and Web-based services, physical things are becoming an integral part of the emerging ubiquitous Web. In this paper, we focus on the things recommendation problem ...
|
||
| Sig-SR: SimRank search over singular graphs | ||
| Weiren Yu, Julie A. McCann | ||
| Pages: 859-862 | ||
| doi>10.1145/2600428.2609459 | ||
|
Full text: |
||
|
SimRank is an attractive structural-context measure of similarity between two objects in a graph. It recursively follows the intuition that "two objects are similar if they are referenced by similar objects". The best known matrix-based method [1] for ...
|
||
| Old dogs are great at new tricks: column stores for IR prototyping | ||
| Hannes Mühleisen, Thaer Samar, Jimmy Lin, Arjen de Vries | ||
| Pages: 863-866 | ||
| doi>10.1145/2600428.2609460 | ||
|
Full text: |
||
|
We make the suggestion that instead of implementing custom index structures and query evaluation algorithms, IR researchers should simply store document representations in a column-oriented relational database and implement ranking models using SQL. ...
|
||
| The role of network distance in LinkedIn people search | ||
| Shih-Wen Huang, Daniel Tunkelang, Karrie Karahalios | ||
| Pages: 867-870 | ||
| doi>10.1145/2600428.2609461 | ||
|
Full text: |
||
|
LinkedIn is the world's largest professional network, with over 300 million members. One of the primary activities on the site is people search, for which LinkedIn members are both the users and the corpus. This paper presents insights about people search ...
|
||
| Latent community discovery through enterprise user search query modeling | ||
| Kevin M. Carter, Rajmonda S. Caceres, Ben Priest | ||
| Pages: 871-874 | ||
| doi>10.1145/2600428.2609462 | ||
|
Full text: |
||
|
Enterprise computer networks are filled with users performing a variety of tasks, ranging from business-critical tasks to personal interest browsing. Due to this multi-modal distribution of behaviors, it is non-trivial to automatically discern which ...
|
||
| Examining collaborative query reformulation: a case of travel information searching | ||
| Abu Shamim Mohammad Arif, Jia Tina Du, Ivan Lee | ||
| Pages: 875-878 | ||
| doi>10.1145/2600428.2609463 | ||
|
Full text: |
||
|
Users often reformulate or modify their queries when they engage in searching information particularly when the search task is complex and exploratory. This paper investigates query reformulation behavior in collaborative tourism information searching ...
|
||
| Influential nodes selection: a data reconstruction perspective | ||
| Zhefeng Wang, Hao Wang, Qi Liu, Enhong Chen | ||
| Pages: 879-882 | ||
| doi>10.1145/2600428.2609464 | ||
|
Full text: |
||
|
Influence maximization is the problem of finding a set of seed nodes in a social network for maximizing the spread of influence. Traditionally, researchers view influence propagation as a stochastic process and formulate the influence maximization problem ...
|
||
| A fusion approach to cluster labeling | ||
| Haggai Roitman, Shay Hummel, Michal Shmueli-Scheuer | ||
| Pages: 883-886 | ||
| doi>10.1145/2600428.2609465 | ||
|
Full text: |
||
|
We present a novel approach to the cluster labeling task using fusion methods. The core idea of our approach is to weigh labels, suggested by any labeler, according to the estimated labeler's decisiveness with respect to each of its suggested labels. ...
|
||
| Evaluating the effort involved in relevance assessments for images | ||
| Martin Halvey, Robert Villa | ||
| Pages: 887-890 | ||
| doi>10.1145/2600428.2609466 | ||
|
Full text: |
||
|
How assessors and end users judge the relevance of images has been studied in information science and information retrieval for a considerable time. The criteria by which assessors judge relevance have been intensively studied, and there has been a large ...
|
||
| Diversifying query suggestions based on query documents | ||
| Youngho Kim, W. Bruce Croft | ||
| Pages: 891-894 | ||
| doi>10.1145/2600428.2609467 | ||
|
Full text: |
||
|
Many domain-specific search tasks are initiated by document-length queries, e.g., patent invalidity search aims to find prior art related to a new (query) patent. We call this type of search Query Document Search. In this type of search, the initial ...
|
||
| Comparing client and server dwell time estimates for click-level satisfaction prediction | ||
| Youngho Kim, Ahmed Hassan, Ryen W. White, Imed Zitouni | ||
| Pages: 895-898 | ||
| doi>10.1145/2600428.2609468 | ||
|
Full text: |
||
|
Click dwell time is the amount of time that a user spends on a clicked search result. Many previous studies have shown that click dwell time is strongly correlated with result-level satisfaction and document relevance. Accurate estimates of dwell time ...
|
||
| Score-safe term-dependency processing with hybrid indexes | ||
| Matthias Petri, Alistair Moffat, J. Shane Culpepper | ||
| Pages: 899-902 | ||
| doi>10.1145/2600428.2609469 | ||
|
Full text: |
||
|
Score-safe index processing has received a great deal of attention over the last two decades. By pre-calculating maximum term impacts during indexing, the number of scoring operations can be minimized, and the top-k documents for a query can be located ...
|
||
| Co-training on authorship attribution with very few labeled examples: methods vs. views | ||
| Tieyun Qian, Bing Liu, Ming Zhong, Guoliang He | ||
| Pages: 903-906 | ||
| doi>10.1145/2600428.2609470 | ||
|
Full text: |
||
|
Authorship attribution (AA) aims to identify the authors of a set of documents. Traditional studies in this area often assume that there is a large set of labeled documents available for training. However, in real life, it is hard or expensive to ...
|
||
| Probabilistic text modeling with orthogonalized topics | ||
| Enpeng Yao, Guoqing Zheng, Ou Jin, Shenghua Bao, Kailong Chen, Zhong Su, Yong Yu | ||
| Pages: 907-910 | ||
| doi>10.1145/2600428.2609471 | ||
|
Full text: |
||
|
Topic models have been widely used for text analysis. Previous topic models have enjoyed great success in mining the latent topic structure of text documents. With many efforts made on endowing the resulting document-topic distributions with different ...
|
||
| Evaluating non-deterministic retrieval systems | ||
| Gaya K. Jayasinghe, William Webber, Mark Sanderson, Lasitha S. Dharmasena, J. Shane Culpepper | ||
| Pages: 911-914 | ||
| doi>10.1145/2600428.2609472 | ||
|
Full text: |
||
|
The use of sampling, randomized algorithms, or training based on the unpredictable inputs of users in Information Retrieval often leads to non-deterministic outputs. Evaluating the effectiveness of systems incorporating these methods can be challenging ...
|
||
| Extending test collection pools without manual runs | ||
| Gaya K. Jayasinghe, William Webber, Mark Sanderson, J. Shane Culpepper | ||
| Pages: 915-918 | ||
| doi>10.1145/2600428.2609473 | ||
|
Full text: |
||
|
Information retrieval test collections traditionally use a combination of automatic and manual runs to create a pool of documents to be judged. The quality of the final judgments produced for a collection is a product of the variety across each of the ...
|
||
| The search duel: a response to a strong ranker | ||
| Peter Izsak, Fiana Raiber, Oren Kurland, Moshe Tennenholtz | ||
| Pages: 919-922 | ||
| doi>10.1145/2600428.2609474 | ||
|
Full text: |
||
|
How can a search engine with a relatively weak relevance ranking function compete with a search engine that has a much stronger ranking function? This dual challenge, which to the best of our knowledge has not been addressed in previous work, entails ...
|
||
| Modeling the evolution of product entities | ||
| Priya Radhakrishnan, Manish Gupta, Vasudeva Varma | ||
| Pages: 923-926 | ||
| doi>10.1145/2600428.2609475 | ||
|
Full text: |
||
|
A large number of web queries are related to product entities. Studying evolution of product entities can help analysts understand the change in particular attribute values for these products. However, studying the evolution of a product requires us ...
|
||
| Predicting bursts and popularity of hashtags in real-time | ||
| Shoubin Kong, Qiaozhu Mei, Ling Feng, Fei Ye, Zhe Zhao | ||
| Pages: 927-930 | ||
| doi>10.1145/2600428.2609476 | ||
|
Full text: |
||
|
Hashtags have been widely used to annotate topics in tweets (short posts on Twitter.com). In this paper, we study the problems of real-time prediction of bursting hashtags. Will a hashtag burst in the near future? If it will, how early can we predict ...
|
||
| Probabilistic ensemble learning for Vietnamese word segmentation | ||
| Wuying Liu, Li Lin | ||
| Pages: 931-934 | ||
| doi>10.1145/2600428.2609477 | ||
|
Full text: |
||
|
Word segmentation is a challenging issue, and the corresponding algorithms can be used in many applications of natural language processing. This paper addresses the problem of Vietnamese word segmentation, proposes a probabilistic ensemble learning (PEL) ...
|
||
| Improving unsupervised query segmentation using parts-of-speech sequence information | ||
| Rishiraj Saha Roy, Yogarshi Vyas, Niloy Ganguly, Monojit Choudhury | ||
| Pages: 935-938 | ||
| doi>10.1145/2600428.2609478 | ||
|
Full text: |
||
|
We present a generic method for augmenting unsupervised query segmentation by incorporating Parts-of-Speech (POS) sequence information to detect meaningful but rare n-grams. Our initial experiments with an existing English POS tagger employing two different ...
|
||
| Building a query log via crowdsourcing | ||
| Omar Alonso, Maria Stone | ||
| Pages: 939-942 | ||
| doi>10.1145/2600428.2609479 | ||
|
Full text: |
||
|
A query log is a key asset in a commercial search engine. Every day, millions of users rely on search engines to find information on the Web by entering a few keywords on a simple search interface. Those queries represent a subset of user behavioral data ...
|
||
| Web search without 'stupid' results | ||
| Aleksandra Lomakina, Nikita Povarov, Pavel Serdyukov | ||
| Pages: 943-946 | ||
| doi>10.1145/2600428.2609480 | ||
|
Full text: |
||
|
One of the main targets of any search engine is to make every user fully satisfied with her search results. For this reason, much effort is being devoted to improving ranking models in order to show the best results to users. However, there is a class ...
|
||
| Discovering real-world use cases for a multimodal math search interface | ||
| Keita Del Valle Wangari, Richard Zanibbi, Anurag Agarwal | ||
| Pages: 947-950 | ||
| doi>10.1145/2600428.2609481 | ||
|
Full text: |
||
|
To use math expressions in search, current search engines require knowing expression names or using a structure editor or string encoding (e.g., LaTeX). For mathematical non-experts, this can lead to an "intention gap" between the query they wish to ...
|
||
| Improving search personalisation with dynamic group formation | ||
| Thanh Tien Vu, Dawei Song, Alistair Willis, Son Ngoc Tran, Jingfei Li | ||
| Pages: 951-954 | ||
| doi>10.1145/2600428.2609482 | ||
|
Full text: |
||
|
Recent research has shown that the performance of search engines can be improved by enriching a user's personal profile with information about other users with shared interests. In the existing approaches, groups of similar users are often statically ...
|
||
| Detection of abnormal profiles on group attacks in recommender systems | ||
| Wei Zhou, Yun Sing Koh, Junhao Wen, Shafiq Alam, Gillian Dobbie | ||
| Pages: 955-958 | ||
| doi>10.1145/2600428.2609483 | ||
|
Full text: |
||
|
Recommender systems using Collaborative Filtering techniques are capable of making personalized predictions. However, these systems are highly vulnerable to profile injection attacks. Group attacks are attacks that target a group of items instead of one, ...
|
||
| On run diversity in Evaluation as a Service | ||
| Ellen M. Voorhees, Jimmy Lin, Miles Efron | ||
| Pages: 959-962 | ||
| doi>10.1145/2600428.2609484 | ||
|
Full text: |
||
|
"Evaluation as a service" (EaaS) is a new methodology that enables community-wide evaluations and the construction of test collections on documents that cannot be distributed. The basic idea is that evaluation organizers provide a service API through ...
|
||
| Evaluating answer passages using summarization measures | ||
| Mostafa Keikha, Jae Hyun Park, W. Bruce Croft | ||
| Pages: 963-966 | ||
| doi>10.1145/2600428.2609485 | ||
|
Full text: |
||
|
Passage-based retrieval models have been studied for some time and have been shown to have some benefits for document ranking. Finding passages that are not only topically relevant, but are also answers to the users' questions would have a significant ...
|
||
| Analyzing bias in CQA-based expert finding test sets | ||
| Reyyan Yeniterzi, Jamie Callan | ||
| Pages: 967-970 | ||
| doi>10.1145/2600428.2609486 | ||
|
Full text: |
||
|
Data retrieved from community question answering (CQA) sites, such as content and users' assessments of content, is commonly used for expertise estimation related tasks. One such task, in which the received votes are directly used as graded relevance ...
|
||
| Understanding negation and family history to improve clinical information retrieval | ||
| Bevan Koopman, Guido Zuccon | ||
| Pages: 971-974 | ||
| doi>10.1145/2600428.2609487 | ||
|
Full text: |
||
|
We present a study to understand the effect that negated terms (e.g., "no fever") and family history (e.g., "family history of diabetes") have on searching clinical records. Our analysis is aimed at devising the most effective means of handling negation ...
|
||
| Modeling dual role preferences for trust-aware recommendation | ||
| Weilong Yao, Jing He, Guangyan Huang, Yanchun Zhang | ||
| Pages: 975-978 | ||
| doi>10.1145/2600428.2609488 | ||
|
Full text: |
||
|
Unlike in general recommendation scenarios where a user has only a single role, users in a trust rating network, e.g., Epinions, are associated with two different roles simultaneously: as a truster and as a trustee. With different roles, users can show ...
|
||
| Mouse movement during relevance judging: implications for determining user attention | ||
| Mark D. Smucker, Xiaoyu Sunny Guo, Andrew Toulis | ||
| Pages: 979-982 | ||
| doi>10.1145/2600428.2609489 | ||
|
Full text: |
||
|
Several researchers have found that a user's mouse position gives an indication of the user's gaze during web search and other tasks. As part of a user study that involved relevance judging of document summaries and full documents, we recorded users' ...
|
||
| PAAP: prefetch-aware admission policies for query results cache in web search engines | ||
| Hongyuan Ma, Wei Liu, Bingjie Wei, Liang Shi, Xiuguo Bao, Lihong Wang, Bin Wang | ||
| Pages: 983-986 | ||
| doi>10.1145/2600428.2609490 | ||
|
Full text: |
||
|
Caching query results is an efficient technique for Web search engines. Admission policy can prevent infrequent queries from taking space of more frequent queries in the cache. In this paper we present two novel admission policies tailored for query ...
|
||
| User geospatial context for music recommendation in microblogs | ||
| Markus Schedl, Andreu Vall, Katayoun Farrahi | ||
| Pages: 987-990 | ||
| doi>10.1145/2600428.2609491 | ||
|
Full text: |
||
|
Music information retrieval and music recommendation are seeing a paradigm shift towards methods that incorporate user context aspects. However, structured experiments on a standardized music dataset to investigate the effects of doing so are scarce. ...
|
||
| Compositional data analysis (CoDA) approaches to distance in information retrieval | ||
| Paul Thomas, David Lovell | ||
| Pages: 991-994 | ||
| doi>10.1145/2600428.2609492 | ||
|
Full text: |
||
|
Many techniques in information retrieval produce counts from a sample, and it is common to analyse these counts as proportions of the whole; term frequencies are a familiar example. Proportions carry only relative information and are not free to vary ...
|
||
| Group latent factor model for recommendation with multiple user behaviors | ||
| Jian Cheng, Ting Yuan, Jinqiao Wang, Hanqing Lu | ||
| Pages: 995-998 | ||
| doi>10.1145/2600428.2609493 | ||
|
Full text: |
||
|
Recently, some recommendation methods try to relieve the data sparsity problem of Collaborative Filtering by exploiting data from users' multiple types of behaviors. However, most of the existing methods mainly consider modeling the correlation between ...
|
||
| Hashing with List-Wise learning to rank | ||
| Zhou Yu, Fei Wu, Yin Zhang, Siliang Tang, Jian Shao, Yueting Zhuang | ||
| Pages: 999-1002 | ||
| doi>10.1145/2600428.2609494 | ||
|
Full text: |
||
|
Hashing techniques have been extensively investigated to boost similarity search for large-scale high-dimensional data. Most of the existing approaches formulate their objective as a pair-wise similarity-preserving problem. In this paper, we consider ...
|
||
| A burstiness-aware approach for document dating | ||
| Dimitrios Kotsakos, Theodoros Lappas, Dimitrios Kotzias, Dimitrios Gunopulos, Nattiya Kanhabua, Kjetil Nørvåg | ||
| Pages: 1003-1006 | ||
| doi>10.1145/2600428.2609495 | ||
|
Full text: |
||
|
A large number of mainstream applications, like temporal search, event detection, and trend identification, assume knowledge of the timestamp of every document in a given textual collection. In many cases, however, the required timestamps are either ...
|
||
| An analysis of query difficulty for information retrieval in the medical domain | ||
| Lorraine Goeuriot, Liadh Kelly, Johannes Leveling | ||
| Pages: 1007-1010 | ||
| doi>10.1145/2600428.2609496 | ||
|
Full text: |
||
|
We present a post-hoc analysis of a benchmarking activity for information retrieval (IR) in the medical domain to determine if performance for queries with different levels of complexity can be associated with different IR methods or techniques. Our ...
|
||
| Mobile query reformulations | ||
| Milad Shokouhi, Rosie Jones, Umut Ozertem, Karthik Raghunathan, Fernando Diaz | ||
| Pages: 1011-1014 | ||
| doi>10.1145/2600428.2609497 | ||
|
Full text: |
||
|
Users frequently interact with web search systems on their mobile devices via multiple modalities, including touch and speech. These interaction modes are substantially different from the user experience on desktop search. As a result, system designers ...
|
||
| On peculiarities of positional effects in sponsored search | ||
| Vyacheslav Alipov, Valery Topinsky, Ilya Trofimov | ||
| Pages: 1015-1018 | ||
| doi>10.1145/2600428.2609498 | ||
|
Full text: |
||
|
Click logs provide a unique and highly valuable source of human judgments on ads' relevance. However, clicks are heavily biased by lots of factors. Two main factors that are widely acknowledged to be the most influential ones are neighboring ads and ...
|
||
| A collective topic model for milestone paper discovery | ||
| Ziyu Lu, Nikos Mamoulis, David W. Cheung | ||
| Pages: 1019-1022 | ||
| doi>10.1145/2600428.2609499 | ||
|
Full text: |
||
|
Prior art lies at the foundation of future work in academic research. However, the increasingly large number of publications makes it difficult for researchers to effectively discover the most important previous works on the topic of their research. ...
|
||
| Document summarization based on word associations | ||
| Oskar Gross, Antoine Doucet, Hannu Toivonen | ||
| Pages: 1023-1026 | ||
| doi>10.1145/2600428.2609500 | ||
|
Full text: |
||
|
In the age of big data, automatic methods for creating summaries of documents become increasingly important. In this paper we propose a novel, unsupervised method for (multi-)document summarization. In an unsupervised and language-independent fashion, ...
|
||
| Do users rate or review?: boost phrase-level sentiment labeling with review-level sentiment classification | ||
| Yongfeng Zhang, Haochen Zhang, Min Zhang, Yiqun Liu, Shaoping Ma | ||
| Pages: 1027-1030 | ||
| doi>10.1145/2600428.2609501 | ||
|
Full text: |
||
|
Current approaches for contextual sentiment lexicon construction in phrase-level sentiment analysis assume that the numerical star rating of a review represents the overall sentiment orientation of the review text. Although widely adopted, we find through ...
|
||
| Random subspace for binary codes learning in large scale image retrieval | ||
| Cong Leng, Jian Cheng, Hanqing Lu | ||
| Pages: 1031-1034 | ||
| doi>10.1145/2600428.2609502 | ||
|
Full text: |
||
|
Due to the fast query speed and low storage cost, hashing based approximate nearest neighbor search methods have attracted much attention recently. Many state of the art methods are based on eigenvalue decomposition. In these approaches, the information ...
|
||
| Incorporating query-specific feedback into learning-to-rank models | ||
| Ethem F. Can, W. Bruce Croft, R. Manmatha | ||
| Pages: 1035-1038 | ||
| doi>10.1145/2600428.2609503 | ||
|
Full text: |
||
|
Relevance feedback has been shown to improve retrieval for a broad range of retrieval models. It is the most common way of adapting a retrieval model for a specific query. In this work, we expand this common way by focusing on an approach that enables ...
|
||
| Large-scale author verification: temporal and topical influences | ||
| Michiel van Dam, Claudia Hauff | ||
| Pages: 1039-1042 | ||
| doi>10.1145/2600428.2609504 | ||
|
Full text: |
||
|
The task of author verification is concerned with the question whether or not someone is the author of a given piece of text. Algorithms that extract writing style features from texts are used to determine how close in style different documents are. ...
|
||
| Evaluating mobile web search performance by taking good abandonment into account | ||
| Olga Arkhipova, Lidia Grauer | ||
| Pages: 1043-1046 | ||
| doi>10.1145/2600428.2609505 | ||
|
Full text: |
||
|
The use of mobile devices for Web search has grown rapidly in recent years. The common tendency of users to want information immediately results in incorporating rich snippets and vertical results into search engine result pages (SERPs) and in ...
|
||
| Assessing the reliability and reusability of an E-discovery privilege test collection | ||
| Jyothi K. Vinjumur, Douglas W. Oard, Jiaul H. Paik | ||
| Pages: 1047-1050 | ||
| doi>10.1145/2600428.2609506 | ||
|
Full text: |
||
|
In some jurisdictions, parties to a lawsuit can request documents from each other, but documents subject to a claim of privilege may be withheld. The TREC 2010 Legal Track developed what is presently the only public test collection for evaluating privilege ...
|
||
| Modeling evolution of a social network using temporal graph kernels | ||
| Akash Anil, Niladri Sett, Sanasam Ranbir Singh | ||
| Pages: 1051-1054 | ||
| doi>10.1145/2600428.2609507 | ||
|
Full text: |
||
|
The majority of studies on modeling the evolution of a social network using spectral graph kernels do not consider temporal effects while estimating the kernel parameters. As a result, such kernels fail to capture structural properties of the evolution ...
|
||
| On user interactions with query auto-completion | ||
| Bhaskar Mitra, Milad Shokouhi, Filip Radlinski, Katja Hofmann | ||
| Pages: 1055-1058 | ||
| doi>10.1145/2600428.2609508 | ||
|
Full text: |
||
|
Query Auto-Completion (QAC) is a popular feature of web search engines that aims to assist users to formulate queries faster and avoid spelling mistakes by presenting them with possible completions as soon as they start typing. However, despite the wide ...
|
||
| Re-ranking approach to classification in large-scale power-law distributed category systems | ||
| Rohit Babbar, Ioannis Partalas, Eric Gaussier, Massih-reza Amini | ||
| Pages: 1059-1062 | ||
| doi>10.1145/2600428.2609509 | ||
|
Full text: |
||
|
For large-scale category systems, such as Directory Mozilla, which consist of tens of thousands of categories, it has been empirically verified in earlier studies that the distribution of documents among categories can be modeled as a power-law distribution. ...
|
||
| Enhancing personalization via search activity attribution | ||
| Adish Singla, Ryen W. White, Ahmed Hassan, Eric Horvitz | ||
| Pages: 1063-1066 | ||
| doi>10.1145/2600428.2609510 | ||
|
Full text: |
||
|
Online services rely on machine identifiers to tailor services such as personalized search and advertising to individual users. The assumption made is that each identifier comprises the behavior of a single person. However, shared machine usage is common, ...
|
||
| A syntax-aware re-ranker for microblog retrieval | ||
| Aliaksei Severyn, Alessandro Moschitti, Manos Tsagkias, Richard Berendsen, Maarten de Rijke | ||
| Pages: 1067-1070 | ||
| doi>10.1145/2600428.2609511 | ||
|
Full text: |
||
|
We tackle the problem of improving microblog retrieval algorithms by proposing a robust structural representation of (query, tweet) pairs. We employ these structures in a principled kernel learning framework that automatically extracts and learns highly ...
|
||
| Weighted aspect-based collaborative filtering | ||
| YanPing Nie, Yang Liu, Xiaohui Yu | ||
| Pages: 1071-1074 | ||
| doi>10.1145/2600428.2609512 | ||
|
Full text: |
||
|
Existing work on collaborative filtering (CF) is often based on the overall ratings the items have received. However, in many cases, understanding how a user rates each aspect of an item may reveal more detailed information about her preferences and ...
|
||
| Evaluating intuitiveness of vertical-aware click models | ||
| Aleksandr Chuklin, Ke Zhou, Anne Schuth, Floor Sietsma, Maarten de Rijke | ||
| Pages: 1075-1078 | ||
| doi>10.1145/2600428.2609513 | ||
|
Full text: |
||
|
Modeling user behavior on a search engine result page is important for understanding the users and supporting simulation experiments. As result pages become more complex, click models evolve as well in order to capture additional aspects of user behavior ...
|
||
| Recipient recommendation in enterprises using communication graphs and email content | ||
| David Graus, David van Dijk, Manos Tsagkias, Wouter Weerkamp, Maarten de Rijke | ||
| Pages: 1079-1082 | ||
| doi>10.1145/2600428.2609514 | ||
|
We address the task of recipient recommendation for emailing in enterprises. We propose an intuitive and elegant way of modeling the task of recipient recommendation, which uses both the communication graph (i.e., who are most closely connected to the ...
|
||
| Analyzing the content emphasis of web search engines | ||
| Mohammed A. Alam, Doug Downey | ||
| Pages: 1083-1086 | ||
| doi>10.1145/2600428.2609515 | ||
|
Millions of people search the Web each day. As a consequence, the ranking algorithms employed by Web search engines have a profound influence on which pages users visit. Characterizing this influence, and informing users when different engines favor ...
|
||
| Effects of task and domain on searcher attention | ||
| Dmitry Lagun, Eugene Agichtein | ||
| Pages: 1087-1090 | ||
| doi>10.1145/2600428.2609516 | ||
|
Previous studies of online user attention during information seeking tasks have mainly focused on analyzing searcher behavior in the web search settings. While these studies enabled better understanding of search result examination, their findings might ...
|
||
| Learning sufficient queries for entity filtering | ||
| Miles Efron, Craig Willis, Garrick Sherman | ||
| Pages: 1091-1094 | ||
| doi>10.1145/2600428.2609517 | ||
|
Entity-centric document filtering is the task of analyzing a time-ordered stream of documents and emitting those that are relevant to a specified set of entities (e.g., people, places, organizations). This task is exemplified by the TREC Knowledge Base ...
|
||
| PatentLine: analyzing technology evolution on multi-view patent graphs | ||
| Longhui Zhang, Lei Li, Tao Li, Qi Zhang | ||
| Pages: 1095-1098 | ||
| doi>10.1145/2600428.2609518 | ||
|
The fast growth of technologies has driven the advancement of our society. It is often necessary to quickly grasp the evolution of technologies in order to better understand the technology trend. The availability of huge volumes of granted patent documents ...
|
||
| Query performance prediction for entity retrieval | ||
| Hadas Raviv, Oren Kurland, David Carmel | ||
| Pages: 1099-1102 | ||
| doi>10.1145/2600428.2609519 | ||
|
We address the query-performance-prediction task for entity retrieval; that is, retrieval effectiveness is estimated with no relevance judgements. First we show how to adapt state-of-the-art query-performance predictors proposed for document retrieval ...
|
||
| Second order probabilistic models for within-document novelty detection in academic articles | ||
| Laurence A.F. Park, Simeon Simoff | ||
| Pages: 1103-1106 | ||
| doi>10.1145/2600428.2609520 | ||
|
It is becoming increasingly difficult to stay aware of the state-of-the-art in any research field due to the exponential increase in the number of academic publications. This problem affects authors and reviewers of submissions to academic journals and ...
|
||
| Modeling the dynamics of personal expertise | ||
| Yi Fang, Archana Godavarthy | ||
| Pages: 1107-1110 | ||
| doi>10.1145/2600428.2609521 | ||
|
Personal expertise or interests often evolve over time. Despite much work on expertise retrieval in recent years, very little work has studied the dynamics of personal expertise. In this paper, we propose a probabilistic model to characterize how ...
|
||
| An annotation similarity model in passage ranking for historical fact validation | ||
| Jun Araki, Jamie Callan | ||
| Pages: 1111-1114 | ||
| doi>10.1145/2600428.2609522 | ||
|
State-of-the-art question answering (QA) systems employ passage retrieval based on bag-of-words similarity models with respect to a query and a passage. We propose a combination of a traditional bag-of-words similarity model and an annotation similarity ...
|
||
| To hint or not: exploring the effectiveness of search hints for complex informational tasks | ||
| Denis Savenkov, Eugene Agichtein | ||
| Pages: 1115-1118 | ||
| doi>10.1145/2600428.2609523 | ||
|
Extensive previous research has shown that searchers often require assistance with query formulation and refinement. Yet, it is not clear what kind of assistance is most useful, and how effective it is both objectively (e.g., in terms of task success) ...
|
||
| The effect of sampling strategy on inferred measures | ||
| Ellen M. Voorhees | ||
| Pages: 1119-1122 | ||
| doi>10.1145/2600428.2609524 | ||
|
Using the inferred measures framework is a popular choice for constructing test collections when the target document set is too large for pooling to be a viable option. Within the framework, different amounts of assessing effort are placed on different ...
|
||
| Cache-conscious runtime optimization for ranking ensembles | ||
| Xun Tang, Xin Jin, Tao Yang | ||
| Pages: 1123-1126 | ||
| doi>10.1145/2600428.2609525 | ||
|
Multi-tree ensemble models have been proven to be effective for document ranking. Using a large number of trees can improve accuracy, but it takes time to calculate ranking scores of matched documents. This paper investigates data traversal methods for ...
|
||
| Bridging temporal context gaps using time-aware re-contextualization | ||
| Andrea Ceroni, Nam Khanh Tran, Nattiya Kanhabua, Claudia Niederée | ||
| Pages: 1127-1130 | ||
| doi>10.1145/2600428.2609526 | ||
|
Understanding a text, which was written some time ago, can be compared to translating a text from another language. Complete interpretation requires a mapping, in this case, a kind of time-travel translation between present context knowledge and context ...
|
||
| An enhanced context-sensitive proximity model for probabilistic information retrieval | ||
| Jiashu Zhao, Jimmy Xiangji Huang | ||
| Pages: 1131-1134 | ||
| doi>10.1145/2600428.2609527 | ||
|
We propose to enhance proximity-based probabilistic retrieval models with more contextual information. A term pair with higher contextual relevance of term proximity is assigned a higher weight. Several measures are proposed to estimate the contextual ...
|
||
| On the information difference between standard retrieval models | ||
| Peter B. Golbus, Javed A. Aslam | ||
| Pages: 1135-1138 | ||
| doi>10.1145/2600428.2609528 | ||
|
Recent work introduced a probabilistic framework that measures search engine performance information-theoretically. This allows for novel meta-evaluation measures such as Information Difference, which measures the magnitude of the difference between ...
|
||
| A POMDP model for content-free document re-ranking | ||
| Sicong Zhang, Jiyun Luo, Hui Yang | ||
| Pages: 1139-1142 | ||
| doi>10.1145/2600428.2609529 | ||
|
Log-based document re-ranking is a special form of session search. The task re-ranks documents from Search Engine Results Page (SERP) according to the search logs, in which both the search activities from other users and personalized query log for a ...
|
||
| Using score differences for search result diversification | ||
| Sadegh Kharazmi, Mark Sanderson, Falk Scholer, David Vallet | ||
| Pages: 1143-1146 | ||
| doi>10.1145/2600428.2609530 | ||
|
We investigate the application of a light-weight approach to result list clustering for the purposes of diversifying search results. We introduce a novel post-retrieval approach, which is independent of external information or even the full-text content ...
|
||
| TREC: topic engineering exercise | ||
| J Shane Culpepper, Stefano Mizzaro, Mark Sanderson, Falk Scholer | ||
| Pages: 1147-1150 | ||
| doi>10.1145/2600428.2609531 | ||
|
In this work, we investigate approaches to engineer better topic sets in information retrieval test collections. By recasting the TREC evaluation exercise from one of building more effective systems to an exercise in building better topics, we present ...
|
||
| How k-12 students search for learning?: analysis of an educational search engine log | ||
| Arif Usta, Ismail Sengor Altingovde, İbrahim Bahattin Vidinli, Rifat Ozcan, Özgür Ulusoy | ||
| Pages: 1151-1154 | ||
| doi>10.1145/2600428.2609532 | ||
|
In this study, we analyze an educational search engine log for shedding light on K-12 students' search behavior in a learning environment. We specially focus on query, session, user and click characteristics and compare the trends to the findings in ...
|
||
| The correlation between cluster hypothesis tests and the effectiveness of cluster-based retrieval | ||
| Fiana Raiber, Oren Kurland | ||
| Pages: 1155-1158 | ||
| doi>10.1145/2600428.2609533 | ||
|
We present a study of the correlation between the extent to which the cluster hypothesis holds, as measured by various tests, and the relative effectiveness of cluster-based retrieval with respect to document-based retrieval. We show that the correlation ...
|
||
| The effect of expanding relevance judgements with duplicates | ||
| Gaurav Baruah, Adam Roegiest, Mark D. Smucker | ||
| Pages: 1159-1162 | ||
| doi>10.1145/2600428.2609534 | ||
|
We examine the effects of expanding a judged set of sentences with their duplicates from a corpus. Including new sentences that are exact duplicates of the previously judged sentences may allow for better estimation of performance metrics and enhance ...
|
||
| On correlation of absence time and search effectiveness | ||
| Sunandan Chakraborty, Filip Radlinski, Milad Shokouhi, Paul Baecke | ||
| Pages: 1163-1166 | ||
| doi>10.1145/2600428.2609535 | ||
|
Online search evaluation metrics are typically derived based on implicit feedback from the users. For instance, computing the number of page clicks, number of queries, or dwell time on a search result. In a recent paper, Dupret and Lalmas introduced ...
|
||
| Necessary and frequent terms in queries | ||
| Jiepu Jiang, James Allan | ||
| Pages: 1167-1170 | ||
| doi>10.1145/2600428.2609536 | ||
|
Vocabulary mismatch has long been recognized as one of the major issues affecting search effectiveness. Ineffective queries usually fail to incorporate important terms and/or incorrectly include inappropriate keywords. However, in this paper we show ...
|
||
| Extracting topics based on authors, recipients and content in microblogs | ||
| Nazneen Fatema N. Rajani, Kate McArdle, Jason Baldridge | ||
| Pages: 1171-1174 | ||
| doi>10.1145/2600428.2609537 | ||
|
Microblogs such as Twitter are important sources for spreading vital information at high speed. They also reflect the general public's reactions and opinions towards major events or stories. With information traveling so quickly, it is helpful to be able ...
|
||
| Exploiting Twitter and Wikipedia for the annotation of event images | ||
| Philip James McParlane, Joemon Jose | ||
| Pages: 1175-1178 | ||
| doi>10.1145/2600428.2609538 | ||
|
With the rise in popularity of smart phones, there has been a recent increase in the number of images taken at large social (e.g. festivals) and world (e.g. natural disasters) events which are uploaded to image sharing websites such as Flickr. As with ...
|
||
| Learning to translate queries for CLIR | ||
| Artem Sokolov, Felix Hieber, Stefan Riezler | ||
| Pages: 1179-1182 | ||
| doi>10.1145/2600428.2609539 | ||
|
The statistical machine translation (SMT) component of cross-lingual information retrieval (CLIR) systems is often regarded as a black box that is optimized for translation quality independent from the retrieval task. In recent work [10], SMT has been ...
|
||
| Predicting query performance in microblog retrieval | ||
| Jesus A. Rodriguez Perez, Joemon M. Jose | ||
| Pages: 1183-1186 | ||
| doi>10.1145/2600428.2609540 | ||
|
Query Performance Prediction (QPP) is the estimation of the retrieval success for a query, without explicit knowledge about relevant documents. QPP is especially interesting in the context of Automatic Query Expansion (AQE) based on Pseudo Relevance ...
|
||
| An event extraction model based on timeline and user analysis in Latent Dirichlet allocation | ||
| Bayar Tsolmon, Kyung-Soon Lee | ||
| Pages: 1187-1190 | ||
| doi>10.1145/2600428.2609541 | ||
|
Social media such as Twitter has come to reflect the reaction of the general public to major events. Since posts are short and noisy, it is hard to extract reliable events based on word frequency. Even though an event term appears in a particularly low ...
|
||
| What makes data robust: a data analysis in learning to rank | ||
| Shuzi Niu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng, Xiubo Geng | ||
| Pages: 1191-1194 | ||
| doi>10.1145/2600428.2609542 | ||
|
When applying learning to rank algorithms in real search applications, noise in human labeled training data becomes an inevitable problem which will affect the performance of the algorithms. Previous work mainly focused on studying how noise affects ...
|
||
| Learning to bridge colloquial and formal language applied to linking and search of E-Commerce data | ||
| Ivan Vulić, Susana Zoghbi, Marie-Francine Moens | ||
| Pages: 1195-1198 | ||
| doi>10.1145/2600428.2609543 | ||
|
We study the problem of linking information between different idiomatic usages of the same language, for example, colloquial and formal language. We propose a novel probabilistic topic model called multi-idiomatic LDA (MiLDA). Its modeling principles ...
|
||
| Uncovering the unarchived web | ||
| Thaer Samar, Hugo C. Huurdeman, Anat Ben-David, Jaap Kamps, Arjen de Vries | ||
| Pages: 1199-1202 | ||
| doi>10.1145/2600428.2609544 | ||
|
Many national and international heritage institutes realize the importance of archiving the web for future cultural heritage. Web archiving is currently performed either by harvesting a national domain, or by crawling a pre-defined list of websites selected ...
|
||
| Inferring topic-dependent influence roles of Twitter users | ||
| Chengyao Chen, Dehong Gao, Wenjie Li, Yuexian Hou | ||
| Pages: 1203-1206 | ||
| doi>10.1145/2600428.2609545 | ||
|
Twitter, as one of the most popular social media platforms, provides a convenient way for people to communicate and interact with each other. It has been well recognized that influence exists during users' interactions. Some pioneer studies on finding ...
|
||
| Reputation analysis with a ranked sentiment-lexicon | ||
| Filipa Peleja, João Santos, João Magalhães | ||
| Pages: 1207-1210 | ||
| doi>10.1145/2600428.2609546 | ||
|
Reputation analysis is naturally linked to a sentiment analysis task of the targeted entities. This analysis leverages a sentiment lexicon that includes general sentiment words and domain specific jargon. However, in most cases target entities are ...
|
||
| On predicting religion labels in microblogging networks | ||
| Minh-Thap Nguyen, Ee-Peng Lim | ||
| Pages: 1211-1214 | ||
| doi>10.1145/2600428.2609547 | ||
|
Religious belief plays an important role in how people behave, influencing how they form preferences, interpret events around them, and develop relationships with others. Traditionally, the religion labels of a user population are obtained by conducting ...
|
||
| Efficiently identify local frequent keyword co-occurrence patterns in geo-tagged Twitter stream | ||
| Xiaoyang Wang, Ying Zhang, Wenjie Zhang, Xuemin Lin | ||
| Pages: 1215-1218 | ||
| doi>10.1145/2600428.2609548 | ||
|
With the prevalence of geo-position enabled devices and services, a rapidly growing number of tweets are associated with geo-tags. Consequently, real-time search on geo-tagged Twitter streams has attracted great attention. In this paper, we advocate ...
|
||
| Item group based pairwise preference learning for personalized ranking | ||
| Shuang Qiu, Jian Cheng, Ting Yuan, Cong Leng, Hanqing Lu | ||
| Pages: 1219-1222 | ||
| doi>10.1145/2600428.2609549 | ||
|
Collaborative filtering with implicit feedback has been steadily receiving more attention, since abundant implicit feedback is more easily collected while explicit feedback is not necessarily always available. Several recent works address this ...
|
||
| Where not to go?: detecting road hazards using twitter | ||
| Avinash Kumar, Miao Jiang, Yi Fang | ||
| Pages: 1223-1226 | ||
| doi>10.1145/2600428.2609550 | ||
|
Conventional approaches to road hazard detection involve manual inspections of roads by government transportation agencies. These approaches are usually expensive to execute, and sometimes are not able to capture the most recent hazards. Moreover, they ...
|
||
| Enhancing sketch-based sport video retrieval by suggesting relevant motion paths | ||
| Ihab Al Kabary, Heiko Schuldt | ||
| Pages: 1227-1230 | ||
| doi>10.1145/2600428.2609551 | ||
|
Searching for scenes in team sport videos is a task that recurs very often in game analysis and other related activities performed by coaches. In most cases, queries are formulated on the basis of specific motion characteristics the user remembers from ...
|
||
| Dynamic location models | ||
| Vanessa Murdock | ||
| Pages: 1231-1234 | ||
| doi>10.1145/2600428.2609552 | ||
|
Location models built on social media have been shown to be an important step toward understanding places in queries. Current search technology focuses on predicting broad regions such as cities. Hyperlocal scenarios are important because of the increasing ...
|
||
| Wikipedia-based query performance prediction | ||
| Gilad Katz, Anna Shtock, Oren Kurland, Bracha Shapira, Lior Rokach | ||
| Pages: 1235-1238 | ||
| doi>10.1145/2600428.2609553 | ||
|
The query-performance prediction task is to estimate retrieval effectiveness with no relevance judgments. Pre-retrieval prediction methods operate prior to retrieval time. Hence, these predictors are often based on analyzing the query and the corpus ...
|
||
| A revisit to social network-based recommender systems | ||
| Hui Li, Dingming Wu, Nikos Mamoulis | ||
| Pages: 1239-1242 | ||
| doi>10.1145/2600428.2609554 | ||
|
With the rapid expansion of online social networks, social network-based recommendation has become a meaningful and effective way of suggesting new items or activities to users. In this paper, we propose two methods to improve the performance of the ...
|
||
| DEMONSTRATION SESSION: Demo session | ||
| Relevation!: An open source system for information retrieval relevance assessment | ||
| Bevan Koopman, Guido Zuccon | ||
| Pages: 1243-1244 | ||
| doi>10.1145/2600428.2611175 | ||
|
Relevation! is a system for performing relevance judgements for information retrieval evaluation. Relevation! is web-based, fully configurable and expandable; it allows researchers to effectively collect assessments and additional qualitative data. The ...
|
||
| WenZher: comprehensive vertical search for healthcare domain | ||
| Liqiang Nie, Tao Li, Mohammad Akbari, Jialie Shen, Tat-Seng Chua | ||
| Pages: 1245-1246 | ||
| doi>10.1145/2600428.2611176 | ||
|
Online health seeking has transformed the way of health knowledge exchange and reusability. The existing general and vertical health search engines, however, just routinely return lists of matched documents or question answer (QA) pairs, which may overwhelm ...
|
||
| STICS: searching with strings, things, and cats | ||
| Johannes Hoffart, Dragan Milchevski, Gerhard Weikum | ||
| Pages: 1247-1248 | ||
| doi>10.1145/2600428.2611177 | ||
|
This paper describes an advanced search engine that supports users in querying documents by means of keywords, entities, and categories. Users simply type words, which are automatically mapped onto appropriate suggestions for entities and categories. ...
|
||
| VIRLab: a web-based virtual lab for learning and studying information retrieval models | ||
| Hui Fang, Hao Wu, Peilin Yang, ChengXiang Zhai | ||
| Pages: 1249-1250 | ||
| doi>10.1145/2600428.2611178 | ||
|
In this paper, we describe VIRLab, a novel web-based virtual laboratory for Information Retrieval (IR). Unlike existing command line based IR toolkits, the VIRLab system provides a more interactive tool that enables easy implementation of retrieval functions ...
|
||
| ServiceXplorer: a similarity-based web service search engine | ||
| Anne H.H. Ngu, Jiangang Ma, Quan Z. Sheng, Lina Yao, Scott Julian | ||
| Pages: 1251-1252 | ||
| doi>10.1145/2600428.2611179 | ||
|
Finding relevant Web services and composing them into value-added applications is becoming increasingly important in cloud and service based marketplaces. The key problem with current approaches to finding relevant Web services is that most of them only ...
|
||
| Real-time visualization and targeting of online visitors | ||
| Deepak Pai, Sandeep Zechariah George | ||
| Pages: 1253-1254 | ||
| doi>10.1145/2600428.2611180 | ||
|
Identifying and targeting visitors on an e-commerce website with personalized content in real-time is extremely important to marketers. Although such targeting exists today, it is based on demographic attributes of the visitors. We show that dynamic ...
|
||
| CharBoxes: a system for automatic discovery of character infoboxes from books | ||
| Manish Gupta, Piyush Bansal, Vasudeva Varma | ||
| Pages: 1255-1256 | ||
| doi>10.1145/2600428.2611181 | ||
|
Entities are central to a large number of real world applications. Wikipedia shows entity infoboxes for a large number of entities. However, not much structured information is available about character entities in books. Automatic discovery of characters ...
|
||
| ADAM: a system for jointly providing ir and database queries in large-scale multimedia retrieval | ||
| Ivan Giangreco, Ihab Al Kabary, Heiko Schuldt | ||
| Pages: 1257-1258 | ||
| doi>10.1145/2600428.2611182 | ||
|
The tremendous increase of multimedia data in recent years has heightened the need for systems that not only allow keyword search, but also support content-based retrieval in order to effectively and efficiently query large collections. ...
|
||
| NicePic!: a system for extracting attractive photos from flickr streams | ||
| Sergej Zerr, Stefan Siersdorfer, Jose San Pedro, Jonathon Hare, Xiaofei Zhu | ||
| Pages: 1259-1260 | ||
| doi>10.1145/2600428.2611183 | ||
|
A large number of images are continuously uploaded to popular photo sharing websites and online social communities. In this demonstration we show a novel application which automatically classifies images in a live photo stream according to their attractiveness ...
|
||
| A perspective-aware approach to search: visualizing perspectives in news search results | ||
| Muhammad Atif Qureshi, Colm O'Riordan, Gabriella Pasi | ||
| Pages: 1261-1262 | ||
| doi>10.1145/2600428.2611184 | ||
|
The result set from a search engine for any user's query may exhibit an inherent perspective due to issues with the search engine or issues with the underlying collection. This demonstration paper presents a system that allows users to specify at query ...
|
||
| FitYou: integrating health profiles to real-time contextual suggestion | ||
| Christopher Wing, Hui Yang | ||
| Pages: 1263-1264 | ||
| doi>10.1145/2600428.2611185 | ||
|
Obesity and its associated health consequences such as high blood pressure and cardiac disease affect a significant proportion of the world's population. At the same time, the popularity of location-based services (LBS) and recommender systems is continually ...
|
||
| Semantic full-text search with broccoli | ||
| Hannah Bast, Florian Bäurle, Björn Buchhold, Elmar Haußmann | ||
| Pages: 1265-1266 | ||
| doi>10.1145/2600428.2611186 | ||
|
We combine search in triple stores with full-text search into what we call semantic full-text search. We provide a fully functional web application that allows the incremental construction of complex queries on the English Wikipedia combined with ...
|
||
| Just-for-me: an adaptive personalization system for location-aware social music recommendation | ||
| Zhiyong Cheng, Jialie Shen, Tao Mei | ||
| Pages: 1267-1268 | ||
| doi>10.1145/2600428.2611187 | ||
|
In recent years, location-aware music recommendation is increasing in popularity, as more and more users consume music on the move. In this demonstration, we present an intelligent system, called Just-for-Me, to facilitate accurate music recommendation ...
|
||
| A novel system for the semi automatic annotation of event images | ||
| Philip James McParlane, Joemon Jose | ||
| Pages: 1269-1270 | ||
| doi>10.1145/2600428.2611188 | ||
|
With the rise in popularity of smart phones, taking and sharing photographs has never been more openly accessible. Further, photo sharing websites, such as Flickr, have made the distribution of photographs easy, resulting in an increase of visual content ...
|
||
| An interactive interface for visualizing events on Twitter | ||
| Andrew J. McMinn, Daniel Tsvetkov, Tsvetan Yordanov, Andrew Patterson, Rrobi Szk, Jesus A. Rodriguez Perez, Joemon M. Jose | ||
| Pages: 1271-1272 | ||
| doi>10.1145/2600428.2611189 | ||
|
In recent years, social media has become one of the most popular tools for discovering and following breaking news and ongoing events. However, tools and interfaces have lagged behind users' expectations, with current tools making it difficult to discover ...
|
||
| ExperTime: tracking expertise over time | ||
| Jan Rybak, Krisztian Balog, Kjetil Nørvåg | ||
| Pages: 1273-1274 | ||
| doi>10.1145/2600428.2611190 | ||
|
This paper presents ExperTime, a web-based system for tracking expertise over time. We visualize a person's expertise profile on a timeline, where we detect and characterize changes in the focus or topics of expertise. It is possible to zoom in on a ...
|
||
| SESSION: Doctoral consortium | ||
| J. Shane Culpepper | ||
| Cluster links prediction for literature based discovery using latent structure and semantic features | ||
| Yakub Sebastian | ||
| Pages: 1275-1275 | ||
| doi>10.1145/2600428.2610376 | ||
|
The potential impact of a scientific article has a significant correlation with its ability to establish novel connections between pre-existing knowledge [1-2]. Discovering hidden connections between the existing scientific literature is an interesting ...
|
||
| Graph-based large scale RDF data compression | ||
| Wei Emma Zhang | ||
| Pages: 1276-1276 | ||
| doi>10.1145/2600428.2610377 | ||
|
We propose a two-stage lossless compression approach on large scale RDF data. Our approach exploits both Representation Compression and Component Compression techniques to support query and dynamic operations directly on the compressed data.
|
||
| Entity-based retrieval | ||
| Hadas Raviv | ||
| Pages: 1277-1277 | ||
| doi>10.1145/2600428.2610378 | ||
|
We address the core challenge of the entity retrieval task: ranking entities in response to a query by their presumed relevance to the information need that the query represents. As an initial research direction we explored two models for entity ranking ...
|
||
| Improving offline and online web search evaluation by modelling the user behaviour | ||
| Eugene Kharitonov | ||
| Pages: 1278-1278 | ||
| doi>10.1145/2600428.2610379 | ||
|
Measurements are fundamental to any empirical science and, similarly, search evaluation is a vital part of information retrieval (IR). Evaluation ensures the progressive development of approaches, tools, and methods studied in this field. Apart from ...
|
||
| Modelling of terms across scripts through autoencoders | ||
| Parth Gupta | ||
| Pages: 1279-1279 | ||
| doi>10.1145/2600428.2610380 | ||
|
... scripts (e.g., Arabic, Greek and Indic languages) one can often find a large amount of user generated transliterated content on the Web in the Roman script. Such content creates a monolingual or cross-lingual space with more than one script which is ...
|
||
| A tag-based personalized item recommendation system using tensor modeling and topic model approaches | ||
| Noor Ifada | ||
| Pages: 1280-1280 | ||
| doi>10.1145/2600428.2610381 | ||
|
This research falls in the area of enhancing the quality of tag-based item recommendation systems. It aims to achieve this by employing a multi-dimensional user profile approach and by analyzing the semantic aspects of tags. Tag-based recommender systems ...
|
||
| Novelty and diversity enhancement and evaluation in recommender systems and information retrieval | ||
| Saúl Vargas | ||
| Pages: 1281-1281 | ||
| doi>10.1145/2600428.2610382 | ||
|
The development and evaluation of Information Retrieval and Recommender Systems has traditionally focused on the relevance and accuracy of retrieved documents and recommendations, respectively. However, there is an increasing realization that accuracy ...
|
||
| Enrichment of user profiles across multiple online social networks for volunteerism matching for social enterprise | ||
| Xuemeng Song | ||
| Pages: 1282-1282 | ||
| doi>10.1145/2600428.2610383 | ||
|
Volunteers are extremely crucial to nonprofit organizations (NPOs) to sustain their continuing operations. On the other hand, many talents are looking for appropriate volunteer opportunities to realize their dreams of making an impact on the world with ...
|
||
| TUTORIAL SESSION: Tutorials | ||
| Choices and constraints: research goals and approaches in information retrieval (part 1) | ||
| Diane Kelly, Filip Radlinski, Jaime Teevan | ||
| Pages: 1283-1283 | ||
| doi>10.1145/2600428.2602289 | ||
|
All research projects begin with a goal, for instance to describe search behavior, to predict when a person will enter a second query, or to discover which IR system performs the best. Different research goals suggest different research approaches, ranging ...
|
||
| Choices and constraints: research goals and approaches in information retrieval (part 2) | ||
| Diane Kelly, Filip Radlinski, Jaime Teevan | ||
| Pages: 1284-1284 | ||
| doi>10.1145/2600428.2602290 | ||
|
All research projects begin with a goal, for instance to describe search behavior, to predict when a person will enter a second query, or to discover which IR system performs the best. Different research goals suggest different research approaches, ranging ...
|
||
| Scalability and efficiency challenges in large-scale web search engines | ||
| B. Barla Cambazoglu, Ricardo Baeza-Yates | ||
| Pages: 1285-1285 | ||
| doi>10.1145/2600428.2602291 | ||
|
Large-scale web search engines rely on massive compute infrastructures to be able to cope with the continuous growth of the Web and their user bases. In such search engines, achieving scalability and efficiency requires making careful architectural design ...
|
||
| Statistical significance testing in information retrieval: theory and practice | ||
| Ben Carterette | ||
| Pages: 1286-1286 | ||
| doi>10.1145/2600428.2602292 | ||
|
||
|
The past 20 years have seen a great improvement in the rigor of information retrieval experimentation, due primarily to two factors: high-quality, public, portable test collections such as those produced by TREC (the Text REtrieval Conference [2]), ...
|
||
| Speech search: techniques and tools for spoken content retrieval | ||
| Gareth J.F. Jones | ||
| Pages: 1287-1287 | ||
| doi>10.1145/2600428.2602293 | ||
|
||
| Axiomatic analysis and optimization of information retrieval models | ||
| Hui Fang, ChengXiang Zhai | ||
| Pages: 1288-1288 | ||
| doi>10.1145/2600428.2602294 | ||
|
||
|
The axiomatic approach provides a systematic way to think about heuristics, identify the weaknesses of existing methods, and optimize the existing methods accordingly. This tutorial aims to promote axiomatic thinking that can benefit not only the study of ...
|
||
| A general account of effectiveness metrics for information tasks: retrieval, filtering, and clustering | ||
| Enrique Amigó, Julio Gonzalo, Stefano Mizzaro | ||
| Pages: 1289-1289 | ||
| doi>10.1145/2600428.2602296 | ||
|
||
|
In this tutorial we will present, review, and compare the most popular evaluation metrics for some of the most salient information related tasks, covering: (i) Information Retrieval, (ii) Clustering, and (iii) Filtering. The tutorial will make a special ...
|
||
| Dynamic information retrieval modeling | ||
| Hui Yang, Marc Sloan, Jun Wang | ||
| Pages: 1290-1290 | ||
| doi>10.1145/2600428.2602297 | ||
|
||
|
Dynamic aspects of Information Retrieval (IR), including changes found in data, users and systems, are increasingly being utilized in search engines and information filtering systems. Existing IR techniques are limited in their ability to optimize over ...
|
||
| The retrievability of documents | ||
| Leif Azzopardi | ||
| Pages: 1291-1291 | ||
| doi>10.1145/2600428.2602298 | ||
|
||
|
Retrievability is an important and interesting indicator that can be used in a number of ways to analyse Information Retrieval systems and document collections. Rather than focusing totally on relevance, retrievability examines what is retrieved, how ...
|
||
| WORKSHOP SESSION: Workshops | ||
| ERD'14: entity recognition and disambiguation challenge | ||
| David Carmel, Ming-Wei Chang, Evgeniy Gabrilovich, Bo-June (Paul) Hsu, Kuansan Wang | ||
| Pages: 1292-1292 | ||
| doi>10.1145/2600428.2600734 | ||
|
||
| SIGIR 2014 workshop on gathering efficient assessments of relevance (GEAR) | ||
| Martin Halvey, Robert Villa, Paul Clough | ||
| Pages: 1293-1293 | ||
| doi>10.1145/2600428.2600735 | ||
|
||
|
Evaluation is a fundamental part of Information Retrieval, and in the conventional Cranfield evaluation paradigm, sets of relevance assessments are a fundamental part of test collections. This workshop revisits how relevance assessments can be efficiently ...
|
||
| MedIR14: medical information retrieval workshop | ||
| Lorraine Goeuriot, Gareth J.F. Jones, Liadh Kelly, Henning Müller, Justin Zobel | ||
| Pages: 1294-1294 | ||
| doi>10.1145/2600428.2600736 | ||
|
||
|
Medical information is accessible from diverse sources including the general web, social media, journal articles, and hospital records; information searchers can be patients and their families, researchers, practitioners and clinicians. Challenges in ...
|
||
| Privacy-preserving IR: when information retrieval meets privacy and security | ||
| Luo Si, Hui Yang | ||
| Pages: 1295-1295 | ||
| doi>10.1145/2600428.2600737 | ||
|
||
|
Information retrieval (IR) and information privacy/security are two fast-growing computer science disciplines. There are many synergies and connections between these two disciplines. However, there have been very limited efforts to connect the two important ...
|
||
| SIGIR 2014 workshop on semantic matching in information retrieval | ||
| Julio Gonzalo, Hang Li, Alessandro Moschitti, Jun Xu | ||
| Pages: 1296-1296 | ||
| doi>10.1145/2600428.2600738 | ||
|
||
|
Recently, significant progress has been made in research on what we call semantic matching (SM), in web search, question answering, online advertisement, cross-language information retrieval, and other tasks. Advanced technologies based on machine learning ...
|
||
| SoMeRA 2014: social media retrieval and analysis workshop | ||
| Markus Schedl, Peter Knees, Jialie Shen | ||
| Pages: 1297-1297 | ||
| doi>10.1145/2600428.2600739 | ||
|
||
|
The SoMeRA workshop targets cutting-edge research from all fields of retrieval, recommendation, and browsing in social media, as well as the analysis of users' multifaceted traces therein. Submissions to the workshop cover a broad range of topics including ...
|
||
| SIGIR 2014 workshop on temporal, social and spatially-aware information access (#TAIA2014) | ||
| Fernando Diaz, Claudia Hauff, Vanessa Murdock, Maarten de Rijke, Milad Shokouhi | ||
| Pages: 1298-1298 | ||
| doi>10.1145/2600428.2600740 | ||
|
||
Welcome to SIGIR, the 37th annual international ACM conference on research and development in Information Retrieval, the premier international conference in the area. We acknowledge all those who submitted papers to the conference and gave the program committee an opportunity to evaluate their work for potential inclusion in the program. Huge thanks are owed to the 57 Area Chairs and 244 general program committee members, who represent 31 countries and almost 200 institutions, for their dedicated work in evaluating the submissions.
The conference received 387 full paper submissions (a 6% increase over last year) of which 82 (21%) were accepted. This constitutes a slight increase in the acceptance rate over last year. The top five countries in terms of accepted papers (taking all author affiliations of each paper equally into account) were the U.S.A. (36%), China (18%), Singapore (9%), Israel (4%), and The Netherlands (3%). The coverage of accepted papers across topics is as follows: Document Representation and Content Analysis (13%), Queries and Query Analysis (16%), Users and Interactive IR (17%), Retrieval Models and Ranking (9%), Search Engine Architectures and Scalability (8%), Filtering and Recommending (8%), Evaluation (5%), Web IR and Social Media Search (13%), IR and Structured Data (1%), Multimedia IR (5%), Other Applications (5%).
As has been customary for many years, SIGIR 2014 employed a two-tier double-blind review process. At least three reviewers reviewed each paper, and then the Primary Area Chair of the paper led a discussion, based on which (s)he wrote a meta-review. The Secondary Area Chair assigned to each paper double-checked the reviews and discussion, and in some cases, provided an additional review. The PC chairs ranked the submitted papers by the meta-review score and then by the average of the three reviewer scores, and carefully examined the reviews and associated discussion. The PC chairs first identified "clear accepts" and "clear rejects". Then the undecided papers were carefully discussed in a face-to-face PC meeting held in Amsterdam, which involved all available Area Chairs. We would like to heartily thank Jaap Kamps for his assistance in organizing the Amsterdam meeting (even mindfully keeping us on track at times!) and for providing the necessary support materials, including a dinner, all of which was greatly appreciated by those who attended the meeting.
The short papers track received 263 submissions (a 3% increase over last year) and accepted 104 of them (a 40% acceptance rate, compared to 34% last year). For the second year, the short papers are four pages long. Deep thanks to Vanessa Murdock, Gabriella Pasi and Andrew Turpin, the short paper track chairs, for their very hard and sustained work in arranging and managing a thorough review process for so many papers.
The tutorial track received 17 submissions, 16 of which underwent review. Seven proposals were accepted. Thanks to Falk Scholer for managing the review process which has ensured a diverse tutorial program of excellent quality.
We received 11 workshop proposals, each of which was peer-reviewed by three members of the Workshops PC, and after discussion of all submissions in the Workshops PC, the final decisions were made in consultation with the PC Chairs of the technical program. A total of 7 workshops were accepted (a 64% acceptance rate). We thank Jaap Kamps and Gabriella Kazai for their efforts in producing a great set of workshops.
The Demos track received 34 submissions of which 16 were accepted. Many thanks to Paul Thomas for chairing this track and providing us with an array of interesting demos to look at.
The Doctoral Consortium received 11 submissions this year, of which 8 were accepted. We are grateful to Shane Culpepper and Grace Hui Yang for chairing the consortium.
This year's industry event is the SIGIR Symposium on IR in Practice (SIRIP 2014) co-chaired by Isabelle Moulinier and David Hawking. Their efforts in organizing SIRIP are gratefully acknowledged. In addition to abstracts of invited talks by industry researchers, and practitioners and consumers of IR, the SIRIP 2014 proceedings include refereed research papers with a strong industry focus (four pages). The proceedings are published separately in electronic-only format.
Finally, special thanks to Mounia Lalmas for chairing the panel to select the Best Paper and Best Student Paper.
This year, SIGIR has the enormous honor of hosting an ACM-W Athena award lecture to be delivered by Susan Dumais. This prestigious award celebrates women researchers who have made fundamental contributions to Computer Science.
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
| SESSION: Keynote address | ||
| Riding the multimedia big data wave | ||
| John R. Smith | ||
| Pages: 1-2 | ||
| doi>10.1145/2484028.2494492 | ||
|
||
|
In this talk we present a perspective across multiple industry problems, including safety and security, medical, Web, social and mobile media, and motivate the need for large-scale analysis and retrieval of multimedia data. We describe a multi-layer ...
|
||
| SESSION: User behaviour | ||
| Beliefs and biases in web search | ||
| Ryen White | ||
| Pages: 3-12 | ||
| doi>10.1145/2484028.2484053 | ||
|
||
|
People's beliefs, and unconscious biases that arise from those beliefs, influence their judgment, decision making, and actions, as is commonly accepted among psychologists. Biases can be observed in information retrieval in situations where searchers ...
|
||
| Improving search result summaries by using searcher behavior data | ||
| Mikhail Ageev, Dmitry Lagun, Eugene Agichtein | ||
| Pages: 13-22 | ||
| doi>10.1145/2484028.2484093 | ||
|
||
|
Query-biased search result summaries, or "snippets", help users decide whether a result is relevant for their information need, and have become increasingly important for helping searchers with difficult or ambiguous search tasks. Previously published ...
|
||
| How query cost affects search behavior | ||
| Leif Azzopardi, Diane Kelly, Kathy Brennan | ||
| Pages: 23-32 | ||
| doi>10.1145/2484028.2484049 | ||
|
||
|
affects how users interact with a search system. Microeconomic theory is used to generate the cost-interaction hypothesis that states as the cost of querying increases, users will pose fewer queries and examine more documents per query. A between-subjects ...
|
||
| Search engine switching detection based on user personal preferences and behavior patterns | ||
| Denis Savenkov, Dmitry Lagun, Qiaoling Liu | ||
| Pages: 33-42 | ||
| doi>10.1145/2484028.2484099 | ||
|
||
|
Sometimes, during a search task users may switch from one search engine to another for several reasons, e.g., dissatisfaction with the current search results or desire for broader topic coverage. Detecting the fact of switching is difficult but important ...
|
||
| SESSION: Social media and network analysis I | ||
| Emerging topic detection for organizations from microblogs | ||
| Yan Chen, Hadi Amiri, Zhoujun Li, Tat-Seng Chua | ||
| Pages: 43-52 | ||
| doi>10.1145/2484028.2484057 | ||
|
||
|
Microblog services have emerged as an essential way to strengthen the communications among individuals and organizations. These services promote timely and active discussions and comments towards products, markets as well as public events, and have attracted ...
|
||
| Pseudo test collections for training and tuning microblog rankers | ||
| Richard Berendsen, Manos Tsagkias, Wouter Weerkamp, Maarten de Rijke | ||
| Pages: 53-62 | ||
| doi>10.1145/2484028.2484063 | ||
|
||
|
Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting ...
|
||
| Learning latent friendship propagation networks with interest awareness for link prediction | ||
| Jun Zhang, Chaokun Wang, Philip S. Yu, Jianmin Wang | ||
| Pages: 63-72 | ||
| doi>10.1145/2484028.2484029 | ||
|
||
|
It's well known that the transitivity of friendship is a popular sociological principle in social networks. However, it is still unknown to what extent people's friend-making behaviors follow this principle and to what extent it can benefit the link ...
|
||
| An experimental study on implicit social recommendation | ||
| Hao Ma | ||
| Pages: 73-82 | ||
| doi>10.1145/2484028.2484059 | ||
|
||
|
Social recommendation problems have drawn a lot of attention recently due to the prevalence of social networking sites. The experiments in previous literature suggest that social information is very effective in improving traditional recommendation algorithms. ...
|
||
| SESSION: Queries I | ||
| Task-aware query recommendation | ||
| Henry Feild, James Allan | ||
| Pages: 83-92 | ||
| doi>10.1145/2484028.2484069 | ||
|
||
|
When generating query recommendations for a user, a natural approach is to try and leverage not only the user's most recently submitted query, or reference query, but also information about the current search context, such as the user's recent search ...
|
||
| Extracting query facets from search results | ||
| Weize Kong, James Allan | ||
| Pages: 93-102 | ||
| doi>10.1145/2484028.2484097 | ||
|
||
|
Web search queries are often ambiguous or multi-faceted, which makes a simple ranked list of results inadequate. To assist information finding for such faceted queries, we explore a technique that explicitly represents interesting facets of a query using ...
|
||
| Learning to personalize query auto-completion | ||
| Milad Shokouhi | ||
| Pages: 103-112 | ||
| doi>10.1145/2484028.2484076 | ||
|
||
|
Query auto-completion (QAC) is one of the most prominent features of modern search engines. The list of query candidates is generated according to the prefix entered by the user in the search box and is updated on each new key stroke. Query prefixes ...
|
||
| Leveraging conceptual lexicon: query disambiguation using proximity information for patent retrieval | ||
| Parvaz Mahdabi, Shima Gerani, Jimmy Xiangji Huang, Fabio Crestani | ||
| Pages: 113-122 | ||
| doi>10.1145/2484028.2484056 | ||
|
||
|
Patent prior art search is a task in patent retrieval where the goal is to rank documents which describe prior art work related to a patent application. One of the main properties of patent retrieval is that the query topic is a full patent application ...
|
||
| SESSION: Users and interactive IR I | ||
| Aggregated search interface preferences in multi-session search tasks | ||
| Marc Bron, Jasmijn van Gorp, Frank Nack, Lotte Belice Baltussen, Maarten de Rijke | ||
| Pages: 123-132 | ||
| doi>10.1145/2484028.2484050 | ||
|
||
|
Aggregated search interfaces provide users with an overview of results from various sources. Two general types of display exist: tabbed, with access to each source in a separate tab, and blended, which combines multiple sources into a single result page. ...
|
||
| An effective implicit relevance feedback technique using affective, physiological and behavioural features | ||
| Yashar Moshfeghi, Joemon M. Jose | ||
| Pages: 133-142 | ||
| doi>10.1145/2484028.2484074 | ||
|
||
|
The effectiveness of various behavioural signals for implicit relevance feedback models has been exhaustively studied. Despite the advantages of such techniques for a real time information retrieval system, most of the behavioural signals are noisy and ...
|
||
| How do users respond to voice input errors?: lexical and phonetic query reformulation in voice search | ||
| Jiepu Jiang, Wei Jeng, Daqing He | ||
| Pages: 143-152 | ||
| doi>10.1145/2484028.2484092 | ||
|
||
|
Voice search offers users a new search experience: instead of typing, users can vocalize their search queries. However, due to voice input errors (such as speech recognition errors and improper system interruptions), users need to frequently reformulate ...
|
||
| Mining touch interaction data on mobile devices to predict web search result relevance | ||
| Qi Guo, Haojian Jin, Dmitry Lagun, Shuai Yuan, Eugene Agichtein | ||
| Pages: 153-162 | ||
| doi>10.1145/2484028.2484100 | ||
|
||
|
Fine-grained search interactions in the desktop setting, such as mouse cursor movements and scrolling, have been shown to be valuable for understanding user intent, attention, and their preferences for Web search results. As web search on smart phones and ...
|
||
| SESSION: Efficiency I | ||
| An information-theoretic account of static index pruning | ||
| Ruey-Cheng Chen, Chia-Jung Lee | ||
| Pages: 163-172 | ||
| doi>10.1145/2484028.2484061 | ||
|
||
|
In this paper, we recast static index pruning as a model induction problem under the framework of Kullback's principle of minimum cross-entropy. We show that static index pruning has an approximate analytical solution in the form of a convex integer program. ...
|
||
| Document identifier reassignment and run-length-compressed inverted indexes for improved search performance | ||
| Diego Arroyuelo, Senén González, Mauricio Oyarzún, Victor Sepulveda | ||
| Pages: 173-182 | ||
| doi>10.1145/2484028.2484079 | ||
|
||
|
Text search engines are a fundamental tool nowadays. Their efficiency relies on a popular and simple data structure: the inverted index. Currently, inverted indexes can be represented very efficiently using index compression schemes. Recent investigations ...
|
||
| Fast document-at-a-time query processing using two-tier indexes | ||
| Cristian Rossi, Edleno S. de Moura, Andre L. Carvalho, Altigran S. da Silva | ||
| Pages: 183-192 | ||
| doi>10.1145/2484028.2484085 | ||
|
||
|
In this paper we present two new algorithms designed to reduce the overall time required to process top-k queries. These algorithms are based on the document-at-a-time approach and modify the best baseline we found in the literature, Blockmax WAND (BMW), ...
|
||
| Faster and smaller inverted indices with treaps | ||
| Roberto Konow, Gonzalo Navarro, Charles L.A. Clarke, Alejandro López-Ortíz | ||
| Pages: 193-202 | ||
| doi>10.1145/2484028.2484088 | ||
|
||
|
We introduce a new representation of the inverted index that performs faster ranked unions and intersections while using less space. Our index is based on the treap data structure, which allows us to intersect/merge the document identifiers while simultaneously ...
|
||
| SESSION: Topic modeling | ||
| An unsupervised topic segmentation model incorporating word order | ||
| Shoaib Jameel, Wai Lam | ||
| Pages: 203-212 | ||
| doi>10.1145/2484028.2484062 | ||
|
||
|
We present a new unsupervised topic discovery model for a collection of text documents. In contrast to the majority of the state-of-the-art topic models, our model does not break the document's structure such as paragraphs and sentences. In addition, ...
|
||
| Semantic hashing using tags and topic modeling | ||
| Qifan Wang, Dan Zhang, Luo Si | ||
| Pages: 213-222 | ||
| doi>10.1145/2484028.2484037 | ||
|
||
|
It is an important research problem to design efficient and effective solutions for large scale similarity search. One popular strategy is to represent data examples as compact binary codes through semantic hashing, which has produced promising results ...
|
||
| Incorporating popularity in topic models for social network analysis | ||
| Youngchul Cha, Bin Bi, Chu-Cheng Hsieh, Junghoo Cho | ||
| Pages: 223-232 | ||
| doi>10.1145/2484028.2484086 | ||
|
||
|
Topic models are used to group words in a text dataset into a set of relevant topics. Unfortunately, when a few words frequently appear in a dataset, the topic groups identified by topic models become noisy because these frequent words repeatedly appear ...
|
||
| Topic hierarchy construction for the organization of multi-source user generated contents | ||
| Xingwei Zhu, Zhao-Yan Ming, Xiaoyan Zhu, Tat-Seng Chua | ||
| Pages: 233-242 | ||
| doi>10.1145/2484028.2484032 | ||
|
||
|
User generated contents (UGCs) carry a huge amount of high quality information. However, the information overload and diversity of UGC sources limit their potential uses. In this research, we propose a framework to organize information from multiple ...
|
||
| SESSION: Users and interactive IR II | ||
| Looking ahead: query preview in exploratory search | ||
| Pernilla Qvarfordt, Gene Golovchinsky, Tony Dunnigan, Elena Agapie | ||
| Pages: 243-252 | ||
| doi>10.1145/2484028.2484084 | ||
|
||
|
Exploratory search is a complex, iterative information seeking activity that involves running multiple queries and finding and examining many documents. We designed a query preview control that visualizes the distribution of newly-retrieved and re-retrieved ...
|
||
| News vertical search: when and what to display to users | ||
| Richard McCreadie, Craig Macdonald, Iadh Ounis | ||
| Pages: 253-262 | ||
| doi>10.1145/2484028.2484080 | ||
|
||
|
News reporting has seen a shift toward fast-paced online reporting in new sources such as social media. Web Search engines that support a news vertical have historically relied upon articles published by major newswire providers when serving news-related ...
|
||
| Toward self-correcting search engines: using underperforming queries to improve search | ||
| Ahmed Hassan, Ryen W. White, Yi-Min Wang | ||
| Pages: 263-272 | ||
| doi>10.1145/2484028.2484043 | ||
|
||
|
Search engines receive queries with a broad range of different search intents. However, they do not perform equally well for all queries. Understanding where search engines perform poorly is critical for improving their performance. In this paper, we ...
|
||
| Fighting search engine amnesia: reranking repeated results | ||
| Milad Shokouhi, Ryen W. White, Paul Bennett, Filip Radlinski | ||
| Pages: 273-282 | ||
| doi>10.1145/2484028.2484075 | ||
|
||
|
Web search engines frequently show the same documents repeatedly for different queries within the same search session, in essence forgetting when the same documents were already shown to users. Depending on previous user interaction with the repeated ...
|
||
| SESSION: Recommender systems | ||
| Addressing cold-start in app recommendation: latent user models constructed from twitter followers | ||
| Jovian Lin, Kazunari Sugiyama, Min-Yen Kan, Tat-Seng Chua | ||
| Pages: 283-292 | ||
| doi>10.1145/2484028.2484035 | ||
|
||
|
As a tremendous number of mobile applications (apps) are readily available, users have difficulty in identifying apps that are relevant to their interests. Recommender systems that depend on previous user ratings (i.e., collaborative filtering, or CF) ...
|
||
| A location-based news article recommendation with explicit localized semantic analysis | ||
| Jeong-Woo Son, A-Yeong Kim, Seong-Bae Park | ||
| Pages: 293-302 | ||
| doi>10.1145/2484028.2484064 | ||
|
||
|
The interest of users in handheld devices is strongly related to their location. Therefore, the user location is important, as a user context, for news article recommendation in a mobile environment. This paper proposes a novel news article recommendation ...
|
||
| Opportunity model for e-commerce recommendation: right product; right time | ||
| Jian Wang, Yi Zhang | ||
| Pages: 303-312 | ||
| doi>10.1145/2484028.2484067 | ||
|
||
|
Most existing e-commerce recommender systems aim to recommend the right product to a user, based on whether the user is likely to purchase or like a product. On the other hand, the effectiveness of recommendations also depends on the time of the recommendation. ...
|
||
| Improve collaborative filtering through bordered block diagonal form matrices | ||
| Yongfeng Zhang, Min Zhang, Yiqun Liu, Shaoping Ma | ||
| Pages: 313-322 | ||
| doi>10.1145/2484028.2484101 | ||
|
||
|
Collaborative Filtering-based recommendation algorithms have achieved widespread success on the Web, but little work has been performed to investigate appropriate user-item relationship structures of rating matrices. This paper presents a novel and general ...
|
||
| SESSION: Retrieval models and ranking I | ||
| Personalized ranking model adaptation for web search | ||
| Hongning Wang, Xiaodong He, Ming-Wei Chang, Yang Song, Ryen W. White, Wei Chu | ||
| Pages: 323-332 | ||
| doi>10.1145/2484028.2484068 | ||
|
||
|
Search engines train and apply a single ranking model across all users, but searchers' information needs are diverse and cover a broad range of topics. Hence, a single user-independent ranking model is insufficient to satisfy different users' result ...
|
||
| Ranking document clusters using markov random fields | ||
| Fiana Raiber, Oren Kurland | ||
| Pages: 333-342 | ||
| doi>10.1145/2484028.2484042 | ||
|
||
|
An important challenge in cluster-based document retrieval is ranking document clusters by their relevance to the query. We present a novel cluster ranking approach that utilizes Markov Random Fields (MRFs). MRFs enable the integration of various types ...
|
||
| A novel TF-IDF weighting scheme for effective ranking | ||
| Jiaul H. Paik | ||
| Pages: 343-352 | ||
| doi>10.1145/2484028.2484070 | ||
|
||
|
Term weighting schemes are central to the study of information retrieval systems. This article proposes a novel TF-IDF term weighting scheme that employs two different within document term frequency normalizations to capture two different aspects of ...
|
||
| Retrieving documents with mathematical content | ||
| Shahab Kamali, Frank Wm. Tompa | ||
| Pages: 353-362 | ||
| doi>10.1145/2484028.2484083 | ||
|
||
|
Many documents with mathematical content are published on the Web, but conventional search engines that rely on keyword search only cannot fully exploit their mathematical information. In particular, keyword search is insufficient when expressions in ...
|
||
| SESSION: Time | ||
| Time-aware point-of-interest recommendation | ||
| Quan Yuan, Gao Cong, Zongyang Ma, Aixin Sun, Nadia Magnenat-Thalmann | ||
| Pages: 363-372 | ||
| doi>10.1145/2484028.2484030 | ||
|
||
|
The availability of user check-in data in large volume from the rapidly growing location-based social networks (LBSNs) enables many important location-aware services to users. Point-of-interest (POI) recommendation is one such service, which is to ...
|
||
| Modeling user's receptiveness over time for recommendation | ||
| Wei Chen, Wynne Hsu, Mong Li Lee | ||
| Pages: 373-382 | ||
| doi>10.1145/2484028.2484047 | ||
|
||
|
Existing recommender systems model user interests and the social influences independently. In reality, user interests may change over time, and as the interests change, new friends may be added while old friends grow apart and the new friendships formed ...
|
||
| Query representation for cross-temporal information retrieval | ||
| Miles Efron | ||
| Pages: 383-392 | ||
| doi>10.1145/2484028.2484054 | ||
|
||
|
This paper addresses the problem of long-term language change in information retrieval (IR) systems. IR research has often ignored lexical drift. But in the emerging domain of massive digitized book collections, the risk of vocabulary mismatch due to ...
|
||
| SESSION: Evaluation I | ||
| On the measurement of test collection reliability | ||
| Julián Urbano, Mónica Marrero, Diego Martín | ||
| Pages: 393-402 | ||
| doi>10.1145/2484028.2484038 | ||
|
||
|
The reliability of a test collection is proportional to the number of queries it contains. But building a collection with many queries is expensive, so researchers have to find a balance between reliability and cost. Previous work on the measurement ...
|
||
| Deciding on an adjustment for multiplicity in IR experiments | ||
| Leonid Boytsov, Anna Belova, Peter Westfall | ||
| Pages: 403-412 | ||
| doi>10.1145/2484028.2484034 | ||
|
||
|
We evaluate statistical inference procedures for small-scale IR experiments that involve multiple comparisons against the baseline. These procedures adjust for multiple comparisons by ensuring that the probability of observing at least one false positive ...
|
||
| Preference based evaluation measures for novelty and diversity | ||
| Praveen Chandar, Ben Carterette | ||
| Pages: 413-422 | ||
| doi>10.1145/2484028.2484094 | ||
|
||
|
Novel and diverse document ranking is an effective strategy that involves reducing redundancy in a ranked list to maximize the amount of novel and relevant information available to users. Evaluation for novelty and diversity typically involves an assessor ...
|
||
| SESSION: Multimedia | ||
| Competence-based song recommendation | ||
| Lidan Shou, Kuang Mao, Xinyuan Luo, Ke Chen, Gang Chen, Tianlei Hu | ||
| Pages: 423-432 | ||
| doi>10.1145/2484028.2484048 | ||
|
||
|
Singing is a popular social activity and a good way of expressing one's feelings. One important reason for unsuccessful singing performance is that the singer fails to choose a suitable song. In this paper, we propose a novel singing competence-based ...
|
||
| A low rank structural large margin method for cross-modal ranking | ||
| Xinyan Lu, Fei Wu, Siliang Tang, Zhongfei Zhang, Xiaofei He, Yueting Zhuang | ||
| Pages: 433-442 | ||
| doi>10.1145/2484028.2484039 | ||
|
||
|
Cross-modal retrieval is a classic research topic in multimedia information retrieval. The traditional approaches study the problem as a pairwise similarity function problem. In this paper, we consider this problem from a new perspective as a listwise ...
|
||
| Learning to name faces: a multimodal learning scheme for search-based face annotation | ||
| Dayong Wang, Steven C.H. Hoi, Pengcheng Wu, Jianke Zhu, Ying He, Chunyan Miao | ||
| Pages: 443-452 | ||
| doi>10.1145/2484028.2484040 | ||
|
||
|
Automated face annotation aims to automatically detect human faces from a photo and further name the faces with the corresponding human names. In this paper, we tackle this open problem by investigating a search-based face annotation (SBFA) paradigm ...
|
||
| SESSION: Search sessions | ||
| Utilizing query change for session search | ||
| Dongyi Guan, Sicong Zhang, Hui Yang | ||
| Pages: 453-462 | ||
| doi>10.1145/2484028.2484055 | ||
|
||
|
Session search is the Information Retrieval (IR) task that performs document retrieval for a search session. During a session, a user constantly modifies queries in order to find relevant documents that fulfill the information need. This paper proposes ...
|
||
| Toward whole-session relevance: exploring intrinsic diversity in web search | ||
| Karthik Raman, Paul N. Bennett, Kevyn Collins-Thompson | ||
| Pages: 463-472 | ||
| doi>10.1145/2484028.2484089 | ||
|
||
|
Current research on web search has focused on optimizing and evaluating single queries. However, a significant fraction of user queries are part of more complex tasks [20] which span multiple queries across one or more search sessions [26,24]. An ideal ...
|
||
| Summaries, ranked retrieval and sessions: a unified framework for information access evaluation | ||
| Tetsuya Sakai, Zhicheng Dou | ||
| Pages: 473-482 | ||
| doi>10.1145/2484028.2484031 | ||
|
||
|
We introduce a general information access evaluation framework that can potentially handle summaries, ranked document lists and even multi query sessions seamlessly. Our framework first builds a trailtext which represents a concatenation of all ...
|
||
| SESSION: Click models | ||
| Modeling click-through based word-pairs for web search | ||
| Jagadeesh Jagarlamudi, Jianfeng Gao | ||
| Pages: 483-492 | ||
| doi>10.1145/2484028.2484082 | ||
|
||
|
Statistical translation models and latent semantic analysis (LSA) are two effective approaches to exploiting click-through data for Web search ranking. While the former learns semantic relationships between query terms and document terms directly, the ...
|
||
| Click model-based information retrieval metrics | ||
| Aleksandr Chuklin, Pavel Serdyukov, Maarten de Rijke | ||
| Pages: 493-502 | ||
| doi>10.1145/2484028.2484071 | ||
|
||
|
In recent years many models have been proposed that are aimed at predicting clicks of web search users. In addition, some information retrieval evaluation metrics have been built on top of a user model. In this paper we bring these two directions together ...
|
||
| Incorporating vertical results into search click models | ||
| Chao Wang, Yiqun Liu, Min Zhang, Shaoping Ma, Meihong Zheng, Jing Qian, Kuo Zhang | ||
| Pages: 503-512 | ||
| doi>10.1145/2484028.2484036 | ||
|
||
|
In modern search engines, an increasing number of search result pages (SERPs) are federated from multiple specialized search engines (called verticals, such as Image or Video). As an effective approach to interpret users' click-through behavior as feedback ...
|
||
| SESSION: Social media and network analysis II | ||
| Personalized time-aware tweets summarization | ||
| Zhaochun Ren, Shangsong Liang, Edgar Meij, Maarten de Rijke | ||
| Pages: 513-522 | ||
| doi>10.1145/2484028.2484052 | ||
|
||
|
We focus on the problem of selecting meaningful tweets given a user's interests; the dynamic nature of user interests, the sheer volume, and the sparseness of individual messages make this a challenging problem. Specifically, we consider the task of ...
|
||
| Exploiting hybrid contexts for Tweet segmentation | ||
| Chenliang Li, Aixin Sun, Jianshu Weng, Qi He | ||
| Pages: 523-532 | ||
| doi>10.1145/2484028.2484044 | ||
|
||
|
Twitter has attracted hundreds of millions of users to share and disseminate the most up-to-date information. However, the noisy and short nature of tweets makes many applications in information retrieval (IR) and natural language processing (NLP) challenging. ...
|
||
| Sumblr: continuous summarization of evolving tweet streams | ||
| Lidan Shou, Zhenhua Wang, Ke Chen, Gang Chen | ||
| Pages: 533-542 | ||
| doi>10.1145/2484028.2484045 | ||
|
||
|
With the explosive growth of microblogging services, short-text messages (also known as tweets) are being created and shared at an unprecedented rate. Tweets in their raw form can be incredibly informative, but also overwhelming. For both end-users and ...
|
||
| Exploiting user feedback to learn to rank answers in Q&A forums: a case study with Stack Overflow | ||
| Daniel Hasan Dalip, Marcos André Gonçalves, Marco Cristo, Pavel Calado | ||
| Pages: 543-552 | ||
| doi>10.1145/2484028.2484072 | ||
|
||
|
Collaborative web sites, such as collaborative encyclopedias, blogs, and forums, are characterized by a loose edit control, which allows anyone to freely edit their content. As a consequence, the quality of this content raises much concern. To deal with ...
|
||
| SESSION: Queries II | ||
| An incremental approach to efficient pseudo-relevance feedback | ||
| Hao Wu, Hui Fang | ||
| Pages: 553-562 | ||
| doi>10.1145/2484028.2484051 | ||
|
||
|
Pseudo-relevance feedback is an important strategy to improve search accuracy. It is often implemented as a two-round retrieval process: the first round is to retrieve an initial set of documents relevant to an original query, and the second round is ...
|
||
| Query expansion using path-constrained random walks | ||
| Jianfeng Gao, Gu Xu, Jinxi Xu | ||
| Pages: 563-572 | ||
| doi>10.1145/2484028.2484058 | ||
|
||
|
This paper exploits Web search logs for query expansion (QE) by presenting a new QE method based on path-constrained random walks (PCRW), where the search logs are represented as a labeled, directed graph, and the probability of picking an expansion ...
|
||
| Efficient query construction for large scale data | ||
| Elena Demidova, Xuan Zhou, Wolfgang Nejdl | ||
| Pages: 573-582 | ||
| doi>10.1145/2484028.2484078 | ||
|
||
|
In recent years, a number of open databases have emerged on the Web, providing Web users with platforms to collaboratively create structured information. As these databases are intended to accommodate heterogeneous information and knowledge, they usually ...
|
||
| Compact query term selection using topically related text | ||
| K. Tamsin Maxwell, W. Bruce Croft | ||
| Pages: 583-592 | ||
| doi>10.1145/2484028.2484096 | ||
|
||
|
Many recent and highly effective retrieval models for long queries use query reformulation methods that jointly optimize term weights and term selection. These methods learn using word context and global context but typically fail to capture query context. ...
|
||
| SESSION: Diversity | ||
| Sentiment diversification with different biases | ||
| Elif Aktolga, James Allan | ||
| Pages: 593-602 | ||
| doi>10.1145/2484028.2484060 | ||
|
||
|
Prior search result diversification work focuses on achieving topical variety in a ranked list, typically equally across all aspects. In this paper, we diversify with sentiments according to an explicit bias. We want to allow users to switch the result ...
|
||
| Term level search result diversification | ||
| Van Dang, W. Bruce Croft | ||
| Pages: 603-612 | ||
| doi>10.1145/2484028.2484095 | ||
|
||
|
Current approaches for search result diversification have been categorized as either implicit or explicit. The implicit approach assumes each document represents its own topic, and promotes diversity by selecting documents for different topics based ...
|
||
| Search result diversification in resource selection for federated search | ||
| Dzung Hong, Luo Si | ||
| Pages: 613-622 | ||
| doi>10.1145/2484028.2484091 | ||
|
||
|
Prior research in resource selection for federated search mainly focused on selecting a small number of information sources that are most relevant to a user query. However, result novelty and diversification are largely unexplored, which does not reflect ...
|
||
| SESSION: Evaluation II | ||
| The effect of threshold priming and need for cognition on relevance calibration and assessment | ||
| Falk Scholer, Diane Kelly, Wan-Ching Wu, Hanseul S. Lee, William Webber | ||
| Pages: 623-632 | ||
| doi>10.1145/2484028.2484090 | ||
|
||
|
Human assessments of document relevance are needed for the construction of test collections, for ad-hoc evaluation, and for training text classifiers. Showing documents to assessors in different orderings, however, may lead to different assessment outcomes. ...
|
||
| User model-based metrics for offline query suggestion evaluation | ||
| Eugene Kharitonov, Craig Macdonald, Pavel Serdyukov, Iadh Ounis | ||
| Pages: 633-642 | ||
| doi>10.1145/2484028.2484041 | ||
|
||
|
Query suggestion or auto-completion mechanisms are widely used by search engines and are increasingly attracting interest from the research community. However, the lack of commonly accepted evaluation methodology and metrics means that it is not possible ...
|
||
| A general evaluation measure for document organization tasks | ||
| Enrique Amigó, Julio Gonzalo, Felisa Verdejo | ||
| Pages: 643-652 | ||
| doi>10.1145/2484028.2484081 | ||
|
||
|
A number of key Information Access tasks -- Document Retrieval, Clustering, Filtering, and their combinations -- can be seen as instances of a generic document organization problem that establishes priority and relatedness relationships between ...
|
||
| SESSION: Retrieval models and ranking II | ||
| Modeling term dependencies with quantum language models for IR | ||
| Alessandro Sordoni, Jian-Yun Nie, Yoshua Bengio | ||
| Pages: 653-662 | ||
| doi>10.1145/2484028.2484098 | ||
|
||
|
Traditional information retrieval (IR) models use bag-of-words as the basic representation and assume that some form of independence holds between terms. Representing term dependencies and defining a scoring function capable of integrating such additional ...
|
||
| Copulas for information retrieval | ||
| Carsten Eickhoff, Arjen P. de Vries, Kevyn Collins-Thompson | ||
| Pages: 663-672 | ||
| doi>10.1145/2484028.2484066 | ||
|
||
|
In many domains of information retrieval, system estimates of document relevance are based on multidimensional quality criteria that have to be accommodated in a unidimensional result ranking. Current solutions to this challenge are often inconsistent ...
|
||
| Taily: shard selection using the tail of score distributions | ||
| Robin Aly, Djoerd Hiemstra, Thomas Demeester | ||
| Pages: 673-682 | ||
| doi>10.1145/2484028.2484033 | ||
|
||
|
Search engines can improve their efficiency by selecting only a few promising shards for each query. State-of-the-art shard selection algorithms first query a central index of sampled documents, and their effectiveness is similar to searching all shards. ...
|
||
| A mutual information-based framework for the analysis of information retrieval systems | ||
| Peter B. Golbus, Javed A. Aslam | ||
| Pages: 683-692 | ||
| doi>10.1145/2484028.2484073 | ||
|
||
|
We consider the problem of information retrieval evaluation and the methods and metrics used for such evaluations. We propose a probabilistic framework for evaluation which we use to develop new information-theoretic evaluation metrics. We demonstrate ...
|
||
| SESSION: Efficiency II | ||
| The impact of solid state drive on search engine cache management | ||
| Jianguo Wang, Eric Lo, Man Lung Yiu, Jiancong Tong, Gang Wang, Xiaoguang Liu | ||
| Pages: 693-702 | ||
| doi>10.1145/2484028.2484046 | ||
|
||
|
Caching is an important optimization in search engine architectures. Existing caching techniques for search engine optimization are mostly biased towards the reduction of random accesses to disks, because random accesses are known to be much more expensive ...
|
||
| Faster upper bounding of intersection sizes | ||
| Daisuke Takuma, Hiroki Yanagisawa | ||
| Pages: 703-712 | ||
| doi>10.1145/2484028.2484065 | ||
|
||
|
There is a long history of developing efficient algorithms for set intersection, which is a fundamental operation in information retrieval and databases. In this paper, we describe a new data structure, a Cardinality Filter, to quickly compute ...
|
||
| Cache-conscious performance optimization for similarity search | ||
| Maha Alabduljalil, Xun Tang, Tao Yang | ||
| Pages: 713-722 | ||
| doi>10.1145/2484028.2484077 | ||
|
||
|
All-pairs similarity search can be implemented in two stages. The first stage is to partition the data and group potentially similar vectors. The second stage is to run a set of tasks where each task compares a partition of vectors with other candidate ...
|
||
| A candidate filtering mechanism for fast top-k query processing on modern CPUs | ||
| Constantinos Dimopoulos, Sergey Nepomnyachiy, Torsten Suel | ||
| Pages: 723-732 | ||
| doi>10.1145/2484028.2484087 | ||
|
||
|
A large amount of research has focused on faster methods for finding top-k results in large document collections, one of the main scalability challenges for web search engines. In this paper, we propose a method for accelerating such top-k queries that ...
|
||
| SESSION: Short Papers 1 -- evaluation | ||
| A test collection for entity search in DBpedia | ||
| Krisztian Balog, Robert Neumayer | ||
| Pages: 737-740 | ||
| doi>10.1145/2484028.2484165 | ||
|
||
|
We develop and make publicly available an entity search test collection based on the DBpedia knowledge base. This includes a large number of queries and corresponding relevance judgments from previous benchmarking campaigns, covering a broad range of ...
|
||
| Author disambiguation by hierarchical agglomerative clustering with adaptive stopping criterion | ||
| Lei Cen, Eduard C. Dragut, Luo Si, Mourad Ouzzani | ||
| Pages: 741-744 | ||
| doi>10.1145/2484028.2484157 | ||
|
||
|
Entity disambiguation is an important step in many information retrieval applications. This paper proposes new research for entity disambiguation with the focus of name disambiguation in digital libraries. In particular, pairwise similarity is first ...
|
||
| Document features predicting assessor disagreement | ||
| Praveen Chandar, William Webber, Ben Carterette | ||
| Pages: 745-748 | ||
| doi>10.1145/2484028.2484161 | ||
|
||
|
The notion of relevance differs between assessors, thus giving rise to assessor disagreement. Although assessor disagreement has been frequently observed, the factors leading to disagreement are still an open problem. In this paper we study the relationship ...
|
||
| Exploring semi-automatic nugget extraction for Japanese one click access evaluation | ||
| Matthew Ekstrand-Abueg, Virgil Pavlu, Makoto Kato, Tetsuya Sakai, Takehiro Yamamoto, Mayu Iwata | ||
| Pages: 749-752 | ||
| doi>10.1145/2484028.2484153 | ||
|
||
|
Building test collections based on nuggets is useful for evaluating systems that return documents, answers, or summaries. However, nugget construction requires a lot of manual work and is not feasible for large query sets. Towards an efficient and ...
|
||
| Report from the NTCIR-10 1CLICK-2 Japanese subtask: baselines, upperbounds and evaluation robustness | ||
| Makoto P. Kato, Tetsuya Sakai, Takehiro Yamamoto, Mayu Iwata | ||
| Pages: 753-756 | ||
| doi>10.1145/2484028.2484117 | ||
|
||
|
The One Click Access Task (1CLICK) of NTCIR requires systems to return a concise multi-document summary of web pages in response to a query which is assumed to have been submitted in a mobile context. Systems are evaluated based on information units ...
|
||
| Building a web test collection using social media | ||
| Chia-Jung Lee, W. Bruce Croft | ||
| Pages: 757-760 | ||
| doi>10.1145/2484028.2484139 | ||
|
||
|
Community Question Answering (CQA) platforms contain a large number of questions and associated answers. Answerers sometimes include URLs as part of the answers to provide further information. This paper describes a novel way of building a test collection ...
|
||
| Summary of the NTCIR-10 INTENT-2 task: subtopic mining and search result diversification | ||
| Tetsuya Sakai, Zhicheng Dou, Takehiro Yamamoto, Yiqun Liu, Min Zhang, Makoto P. Kato, Ruihua Song, Mayu Iwata | ||
| Pages: 761-764 | ||
| doi>10.1145/2484028.2484104 | ||
|
||
|
The NTCIR INTENT task comprises two subtasks: Subtopic Mining, where systems are required to return a ranked list of subtopic strings for each given query; and Document Ranking, where systems are required to return a diversified web ...
|
||
| Is relevance hard work?: evaluating the effort of making relevant assessments | ||
| Robert Villa, Martin Halvey | ||
| Pages: 765-768 | ||
| doi>10.1145/2484028.2484150 | ||
|
||
|
The judging of relevance has been a subject of study in information retrieval for a long time, especially in the creation of relevance judgments for test collections. While the criteria by which assessors judge relevance have been intensively studied, ...
|
||
| SESSION: Short papers 1 -- filtering and recommending | ||
| A weakly-supervised detection of entity central documents in a stream | ||
| Ludovic Bonnefoy, Vincent Bouvier, Patrice Bellot | ||
| Pages: 769-772 | ||
| doi>10.1145/2484028.2484180 | ||
|
||
|
Filtering a time-ordered corpus for documents that are highly relevant to an entity is a task that has received increasing attention in recent years. One application is to reduce the delay between the moment information about an entity is first observed ...
|
||
| Sentiment analysis of user comments for one-class collaborative filtering over TED talks | ||
| Nikolaos Pappas, Andrei Popescu-Belis | ||
| Pages: 773-776 | ||
| doi>10.1145/2484028.2484116 | ||
|
||
|
User-generated texts such as reviews, comments or discussions are valuable indicators of users' preferences. Unlike previous works which focus on labeled data from user-contributed reviews, we focus here on user comments which are not accompanied by ...
|
||
| Modeling the uniqueness of the user preferences for recommendation systems | ||
| Haggai Roitman, David Carmel, Yosi Mass, Iris Eiron | ||
| Pages: 777-780 | ||
| doi>10.1145/2484028.2484102 | ||
|
||
|
In this paper we propose a novel framework for modeling the uniqueness of the user preferences for recommendation systems. User uniqueness is determined by learning to what extent the user's item preferences deviate from those of an "average user" in ...
|
||
| Recommending personalized touristic sights using Google Places | ||
| Maya Sappelli, Suzan Verberne, Wessel Kraaij | ||
| Pages: 781-784 | ||
| doi>10.1145/2484028.2484155 | ||
|
||
|
The purpose of the Contextual Suggestion track, an evaluation task at the TREC 2012 conference, is to suggest personalized tourist activities to an individual, given a certain location and time. In our content-based approach, we collected initial recommendations ...
|
||
| Optimizing top-n collaborative filtering via dynamic negative item sampling | ||
| Weinan Zhang, Tianqi Chen, Jun Wang, Yong Yu | ||
| Pages: 785-788 | ||
| doi>10.1145/2484028.2484126 | ||
|
||
|
Collaborative filtering techniques rely on aggregated user preference data to make personalized predictions. In many cases, users are reluctant to explicitly express their preferences and many recommender systems have to infer them from implicit user ...
|
||
| SESSION: Short papers 1 -- multimedia IR | ||
| Towards retrieving relevant information graphics | ||
| Zhuo Li, Matthew Stagitis, Sandra Carberry, Kathleen F. McCoy | ||
| Pages: 789-792 | ||
| doi>10.1145/2484028.2484164 | ||
|
||
|
Information retrieval research has made significant progress in the retrieval of text documents and images. However, relatively little attention has been given to the retrieval of information graphics (non-pictorial images such as bar charts and line ...
|
||
| Hybrid retrieval approaches to geospatial music recommendation | ||
| Markus Schedl, Dominik Schnitzer | ||
| Pages: 793-796 | ||
| doi>10.1145/2484028.2484146 | ||
|
||
|
Recent advances in music retrieval and recommendation algorithms highlight the necessity to follow multimodal approaches in order to transcend limits imposed by methods that solely use audio, web, or collaborative filtering data. In this paper, we propose ...
|
||
| Leveraging viewer comments for mood classification of music video clips | ||
| Takehiro Yamamoto, Satoshi Nakamura | ||
| Pages: 797-800 | ||
| doi>10.1145/2484028.2484118 | ||
|
||
|
This short paper proposes a method to classify music video clips uploaded to a video sharing service into music mood categories such as 'cheerful,' 'wistful,' and 'aggressive.' The method leverages viewer comments posted to the music video clips for ...
|
||
| SESSION: Short papers 1 -- queries and query analysis | ||
| Exploiting semantics for improving clinical information retrieval | ||
| Atanaz Babashzadeh, Jimmy Huang, Mariam Daoud | ||
| Pages: 801-804 | ||
| doi>10.1145/2484028.2484167 | ||
|
||
|
Clinical information retrieval (IR) presents several challenges including terminology mismatch and granularity mismatch. One of the main objectives in clinical IR is to bridge the semantic gap between queries and documents and go beyond keyword matching. ...
|
||
| Interpretation of coordinations, compound generation, and result fusion for query variants | ||
| Johannes Leveling | ||
| Pages: 805-808 | ||
| doi>10.1145/2484028.2484115 | ||
|
||
|
We investigate interpreting coordinations (e.g. word sequences connected with coordinating conjunctions such as "and" and "or") as logical disjunctions of terms to generate a set of disjunction-free query variants for information retrieval (IR) queries. ...
|
||
| Time-aware structured query suggestion | ||
| Taiki Miyanishi, Tetsuya Sakai | ||
| Pages: 809-812 | ||
| doi>10.1145/2484028.2484143 | ||
|
||
|
Most commercial search engines have a query suggestion feature, which is designed to capture various possible search intents behind the user's original query. However, even though different search intents behind a given query may have been popular at ...
|
||
| Flat vs. hierarchical phrase-based translation models for cross-language information retrieval | ||
| Ferhan Ture, Jimmy Lin | ||
| Pages: 813-816 | ||
| doi>10.1145/2484028.2484137 | ||
|
||
|
Although context-independent word-based approaches remain popular for cross-language information retrieval, many recent studies have shown that integrating insights from modern statistical machine translation systems can lead to substantial improvements ...
|
||
| Here and there: goals, activities, and predictions about location from geotagged queries | ||
| Robert West, Ryen W. White, Eric Horvitz | ||
| Pages: 817-820 | ||
| doi>10.1145/2484028.2484125 | ||
|
||
|
A significant portion of Web search is performed in mobile settings. We explore the links between users' queries on mobile devices and their locations and movement, with a focus on interpreting queries about addresses. We find that users tend to have ...
|
||
| Query change as relevance feedback in session search | ||
| Sicong Zhang, Dongyi Guan, Hui Yang | ||
| Pages: 821-824 | ||
| doi>10.1145/2484028.2484171 | ||
|
||
|
Session search is the Information Retrieval (IR) task that performs document retrieval for an entire session. During a session, users often change queries to explore and investigate the information needs. In this paper, we propose to use query change ...
|
||
| SESSION: Short papers 1 -- retrieval models and ranking | ||
| Is uncertain logical-matching equivalent to conditional probability? | ||
| Karam Abdulahhad, Jean-Pierre Chevallet, Catherine Berrut | ||
| Pages: 825-828 | ||
| doi>10.1145/2484028.2484152 | ||
|
||
|
Logic-based Information Retrieval (IR) models represent the retrieval decision as a logical implication d->q between a document d and a query q, where d and q are logical sentences. However, d->q is a binary decision; we thus need a measure to ...
|
||
| Boosting novelty for biomedical information retrieval through probabilistic latent semantic analysis | ||
| Xiangdong An, Jimmy Xiangji Huang | ||
| Pages: 829-832 | ||
| doi>10.1145/2484028.2484174 | ||
|
||
|
In information retrieval, we are interested in the information that is not only relevant but also novel. In this paper, we study how to boost novelty for biomedical information retrieval through probabilistic latent semantic analysis. We conduct the ...
|
||
| Learning to combine representations for medical records search | ||
| Nut Limsopatham, Craig Macdonald, Iadh Ounis | ||
| Pages: 833-836 | ||
| doi>10.1145/2484028.2484177 | ||
|
||
|
The complexity of medical terminology raises challenges when searching medical records. For example, 'cancer', 'tumour', and 'neoplasms', which are synonyms, may prevent a traditional search system from retrieving relevant records that contain only synonyms ...
|
||
| Kinship contextualization: utilizing the preceding and following structural elements | ||
| Muhammad A. Norozi, Paavo Arvola | ||
| Pages: 837-840 | ||
| doi>10.1145/2484028.2484111 | ||
|
||
|
The textual context of an element, structurally, contains traces of evidence. Utilizing this context in scoring is called contextualization. In this study we hypothesize that the context of an XML-element originated from its preceding ...
|
||
| The cluster hypothesis for entity oriented search | ||
| Hadas Raviv, Oren Kurland, David Carmel | ||
| Pages: 841-844 | ||
| doi>10.1145/2484028.2484128 | ||
|
||
|
In this work we study the cluster hypothesis for entity oriented search (EOS). Specifically, we show that the hypothesis can hold to a substantial extent for several entity similarity measures. We also demonstrate the retrieval effectiveness merits of ...
|
||
| Self reinforcement for important passage retrieval | ||
| Ricardo Ribeiro, Luís Marujo, David Martins de Matos, João P. Neto, Anatole Gershman, Jaime Carbonell | ||
| Pages: 845-848 | ||
| doi>10.1145/2484028.2484134 | ||
|
||
|
In general, centrality-based retrieval models treat all elements of the retrieval space equally, which may reduce their effectiveness. In the specific context of extractive summarization (or important passage retrieval), this means that these models ...
|
||
| What can pictures tell us about web pages?: improving document search using images | ||
| Sergio Rodriguez-Vaamonde, Lorenzo Torresani, Andrew Fitzgibbon | ||
| Pages: 849-852 | ||
| doi>10.1145/2484028.2484144 | ||
|
||
|
Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion ...
|
||
| Estimating query representativeness for query-performance prediction | ||
| Mor Sondak, Anna Shtok, Oren Kurland | ||
| Pages: 853-856 | ||
| doi>10.1145/2484028.2484107 | ||
|
||
|
The query-performance prediction (QPP) task is estimating retrieval effectiveness with no relevance judgments. We present a novel probabilistic framework for QPP that gives rise to an important aspect that was not addressed in previous work; namely, ...
|
||
| Interoperability ranking for mobile applications | ||
| Dragomir Yankov, Pavel Berkhin, Rajen Subba | ||
| Pages: 857-860 | ||
| doi>10.1145/2484028.2484122 | ||
|
||
|
At present, most major app marketplaces perform ranking and recommendation based on search relevance features or marketplace "popularity" statistics. For instance, they check similarity between app descriptions and user search queries, or rank-order ...
|
||
| SESSION: Short papers 1 -- social media IR | ||
| SoPRa: a new social personalized ranking function for improving web search | ||
| Mohamed Reda Bouadjenek, Hakim Hacid, Mokrane Bouzeghoub | ||
| Pages: 861-864 | ||
| doi>10.1145/2484028.2484131 | ||
|
||
|
We present in this paper a contribution to IR modeling by proposing a new ranking function called SoPRa that considers the social dimension of the Web. This social dimension is any social information that surrounds documents along with the social context ...
|
||
| Browse with a social web directory | ||
| Hao Huang, Yunjun Gao, Lu Chen, Rui Li, Kevin Chiew, Qinming He | ||
| Pages: 865-868 | ||
| doi>10.1145/2484028.2484141 | ||
|
||
|
Browsing with either web directories or social bookmarks is an important complement to keyword search in web information retrieval. To improve users' browsing experience and facilitate web directory construction, in this paper, we propose a ...
|
||
| Who will retweet me?: finding retweeters in Twitter | ||
| Zhunchen Luo, Miles Osborne, Jintao Tang, Ting Wang | ||
| Pages: 869-872 | ||
| doi>10.1145/2484028.2484158 | ||
|
||
|
An important aspect of communication in Twitter (and other social networks) is message propagation -- people creating posts for others to share. Although there has been work on modelling how tweets in Twitter are propagated (retweeted), an untackled problem ...
|
||
| A financial cost metric for result caching | ||
| Fethi Burak Sazoglu, B. Barla Cambazoglu, Rifat Ozcan, Ismail Sengor Altingovde, Özgür Ulusoy | ||
| Pages: 873-876 | ||
| doi>10.1145/2484028.2484182 | ||
|
||
|
Web search engines cache results of frequent and/or recent queries. Result caching strategies can be evaluated using different metrics, hit rate being the most well-known. Recent works take the processing overhead of queries into account when evaluating ...
|
||
| SESSION: Short papers 1 -- topic models | ||
| Document classification by topic labeling | ||
| Swapnil Hingmire, Sandeep Chougule, Girish K. Palshikar, Sutanu Chakraborti | ||
| Pages: 877-880 | ||
| doi>10.1145/2484028.2484140 | ||
|
||
|
In this paper, we propose a Latent Dirichlet Allocation (LDA) [1] based document classification algorithm that does not require any labeled dataset. In our algorithm, we construct a topic model using LDA, assign one topic to one of the class labels, aggregate ...
|
||
| Mining web search topics with diverse spatiotemporal patterns | ||
| Di Jiang, Wilfred Ng | ||
| Pages: 881-884 | ||
| doi>10.1145/2484028.2484124 | ||
|
||
|
Mining the latent topics from web search data and capturing their spatiotemporal patterns have many applications in information retrieval. As web search is heavily influenced by the spatial and temporal factors, the latent topics usually demonstrate ...
|
||
| A novel topic model for automatic term extraction | ||
| Sujian Li, Jiwei Li, Tao Song, Wenjie Li, Baobao Chang | ||
| Pages: 885-888 | ||
| doi>10.1145/2484028.2484106 | ||
|
||
|
Automatic term extraction (ATE) aims at extracting domain-specific terms from a corpus of a certain domain. Termhood is one essential measure for judging whether a phrase is a term. Previous research on termhood mainly depends on the word frequency ...
expand
|
||
| Improving LDA topic models for microblogs via tweet pooling and automatic labeling | ||
| Rishabh Mehrotra, Scott Sanner, Wray Buntine, Lexing Xie | ||
| Pages: 889-892 | ||
| doi>10.1145/2484028.2484166 | ||
|
||
|
Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic ...
|
||
| SESSION: Short papers 1 -- users and interactive IR | ||
| Extractive summarisation via sentence removal: condensing relevant sentences into a short summary | ||
| Marco Bonzanini, Miguel Martinez-Alvarez, Thomas Roelleke | ||
| Pages: 893-896 | ||
| doi>10.1145/2484028.2484149 | ||
|
||
|
Many on-line services allow users to describe their opinions about a product or a service through a review. In order to help other users to find out the major opinion about a given topic, without the effort to read several reviews, multi-document summarisation ...
|
||
| Characterizing stages of a multi-session complex search task through direct and indirect query modifications | ||
| Jiyin He, Marc Bron, Arjen P. de Vries | ||
| Pages: 897-900 | ||
| doi>10.1145/2484028.2484178 | ||
|
||
|
Search systems use context to effectively satisfy a user's information need as expressed by a query. Tasks are important factors in determining user context during search and many studies have been conducted that identify tasks and task stages through ...
|
||
| Displaying relevance scores for search results | ||
| Guy Shani, Noam Tractinsky | ||
| Pages: 901-904 | ||
| doi>10.1145/2484028.2484112 | ||
|
||
|
Internet search engines typically compute a relevance score for webpages given the query terms, and then rank the pages by decreasing relevance scores. The popular search engines do not, however, present the relevance scores that were computed during ...
|
||
| Studying page life patterns in dynamical web | ||
| Alexey Tikhonov, Ivan Bogatyy, Pavel Burangulov, Liudmila Ostroumova, Vitaliy Koshelev, Gleb Gusev | ||
| Pages: 905-908 | ||
| doi>10.1145/2484028.2484185 | ||
|
||
|
With the ever-increasing speed of content turnover on the web, it is particularly important to understand the patterns that pages' popularity follows. This paper focuses on the dynamical part of the web, i.e. pages that have a limited lifespan and experience ...
|
||
| SESSION: Short papers 2 -- evaluation | ||
| A document rating system for preference judgements | ||
| Maryam Bashir, Jesse Anderton, Jie Wu, Peter B. Golbus, Virgil Pavlu, Javed A. Aslam | ||
| Pages: 909-912 | ||
| doi>10.1145/2484028.2484170 | ||
|
||
|
High quality relevance judgments are essential for the evaluation of information retrieval systems. Traditional methods of collecting relevance judgments are based on collecting binary or graded nominal judgments, but such judgments are limited by factors ...
|
||
| Relevance dimensions in preference-based IR evaluation | ||
| Jinyoung Kim, Gabriella Kazai, Imed Zitouni | ||
| Pages: 913-916 | ||
| doi>10.1145/2484028.2484168 | ||
|
||
|
Evaluation of information retrieval (IR) systems has recently been exploring the use of preference judgments over two search result lists. Unlike the traditional method of collecting relevance labels per single result, this method allows to consider ...
|
||
| Composition of TF normalizations: new insights on scoring functions for ad hoc IR | ||
| François Rousseau, Michalis Vazirgiannis | ||
| Pages: 917-920 | ||
| doi>10.1145/2484028.2484121 | ||
|
||
|
Previous papers in ad hoc IR reported that scoring functions should satisfy a set of heuristic retrieval constraints, providing a mathematical justification for the normalizations historically applied to the term frequency (TF). In this paper, we propose ...
|
||
| The impact of intent selection on diversified search evaluation | ||
| Tetsuya Sakai, Zhicheng Dou, Charles L.A. Clarke | ||
| Pages: 921-924 | ||
| doi>10.1145/2484028.2484105 | ||
|
||
|
To construct a diversified search test collection, a set of possible subtopics (or intents) needs to be determined for each topic, in one way or another, and per-intent relevance assessments need to be obtained. In the TREC Web Track Diversity Task, subtopics ...
|
||
| A comparison of the optimality of statistical significance tests for information retrieval evaluation | ||
| Julián Urbano, Mónica Marrero, Diego Martín | ||
| Pages: 925-928 | ||
| doi>10.1145/2484028.2484163 | ||
|
||
|
Previous research has suggested the permutation test as the theoretically optimal statistical significance test for IR evaluation, and advocated for the discontinuation of the Wilcoxon and sign tests. We present a large-scale study comprising nearly ...
|
||
| Assessor disagreement and text classifier accuracy | ||
| William Webber, Jeremy Pickens | ||
| Pages: 929-932 | ||
| doi>10.1145/2484028.2484156 | ||
|
||
|
Text classifiers are frequently used for high-yield retrieval from large corpora, such as in e-discovery. The classifier is trained by annotating example documents for relevance. These examples may, however, be assessed by people other than those whose ...
|
||
| Sequential testing in classifier evaluation yields biased estimates of effectiveness | ||
| William Webber, Mossaab Bagdouri, David D. Lewis, Douglas W. Oard | ||
| Pages: 933-936 | ||
| doi>10.1145/2484028.2484159 | ||
|
||
|
It is common to develop and validate classifiers through a process of repeated testing, with nested training and/or test sets of increasing size. We demonstrate in this paper that such repeated testing leads to biased estimates of classifier effectiveness. ...
|
||
| Relating retrievability, performance and length | ||
| Colin Wilkie, Leif Azzopardi | ||
| Pages: 937-940 | ||
| doi>10.1145/2484028.2484145 | ||
|
||
|
Retrievability provides a different way to evaluate an Information Retrieval (IR) system as it focuses on how easily documents can be found. It is intrinsically related to retrieval performance because a document needs to be retrieved before it can be ...
|
||
| SESSION: Short papers 2 -- filtering and recommending | ||
| Cumulative citation recommendation: classification vs. ranking | ||
| Krisztian Balog, Heri Ramampiaro | ||
| Pages: 941-944 | ||
| doi>10.1145/2484028.2484151 | ||
|
||
|
Cumulative citation recommendation refers to the task of filtering a time-ordered corpus for documents that are highly relevant to a predefined set of entities. This task has been introduced at the TREC Knowledge Base Acceleration track in 2012, where ...
|
||
| Tagcloud-based explanation with feedback for recommender systems | ||
| Wei Chen, Wynne Hsu, Mong Li Lee | ||
| Pages: 945-948 | ||
| doi>10.1145/2484028.2484108 | ||
|
||
|
Personalized recommender systems aim to push only the relevant items and information directly to the users without requiring them to browse through millions of web resources. The challenge of these systems is to achieve a high user acceptance rate on ...
|
||
| Collaborative factorization for recommender systems | ||
| Chaosheng Fan, Yanyan Lan, Jiafeng Guo, Zuoquan Lin, Xueqi Cheng | ||
| Pages: 949-953 | ||
| doi>10.1145/2484028.2484176 | ||
|
||
|
Recommender systems have become an effective tool for information filtering, usually providing the most useful items to users in a top-k ranking list. Traditional recommendation techniques such as Nearest Neighbors (NN) and Matrix Factorization (MF) ...
|
||
| RecSys for distributed events: investigating the influence of recommendations on visitor plans | ||
| Richard Schaller, Morgan Harvey, David Elsweiler | ||
| Pages: 953-956 | ||
| doi>10.1145/2484028.2484119 | ||
|
||
|
Distributed events are collections of events taking place within a small area over the same time period and relating to a single topic. There are often a large number of events on offer and the times in which they can be visited are heavily constrained, ...
|
||
| SESSION: Short papers 2 -- multimedia IR | ||
| Ranking-oriented nearest-neighbor based method for automatic image annotation | ||
| Chaoran Cui, Jun Ma, Tao Lian, Xiaofang Wang, Zhaochun Ren | ||
| Pages: 957-960 | ||
| doi>10.1145/2484028.2484113 | ||
|
||
|
Automatic image annotation plays a critical role in keyword-based image retrieval systems. Recently, the nearest-neighbor based scheme has been proposed and achieved good performance for image annotation. Given a new image, the scheme is to first find ...
|
||
| Linking transcribed conversational speech | ||
| Joseph Malionek, Douglas W. Oard, Abhijeet Sangwan, John H.L. Hansen | ||
| Pages: 961-964 | ||
| doi>10.1145/2484028.2484136 | ||
|
||
|
As large collections of historically significant recorded speech become increasingly available, scholars are faced with the challenge of making sense of what they hear. This paper proposes automatically linking conversational speech to related resources ...
|
||
| On contextual photo tag recommendation | ||
| Philip J. McParlane, Yashar Moshfeghi, Joemon M. Jose | ||
| Pages: 965-968 | ||
| doi>10.1145/2484028.2484160 | ||
|
||
|
Image tagging is a growing application on social media websites; however, the performance of many auto-tagging methods is often poor. Recent work has exploited an image's context (e.g. time and location) in the tag recommendation process, where tags ...
|
||
| The knowing camera: recognizing places-of-interest in smartphone photos | ||
| Pai Peng, Lidan Shou, Ke Chen, Gang Chen, Sai Wu | ||
| Pages: 969-972 | ||
| doi>10.1145/2484028.2484173 | ||
|
||
|
This paper presents a framework called Knowing Camera for real-time recognition of places-of-interest in smartphone photos, with the availability of online geotagged images of such places. We propose a probabilistic field-of-view model which captures the ...
|
||
| SESSION: Short papers 2 -- queries and query analysis | ||
| Question retrieval with user intent | ||
| Long Chen, Dell Zhang, Mark Levene | ||
| Pages: 973-976 | ||
| doi>10.1145/2484028.2484129 | ||
|
||
|
Community Question Answering (CQA) services, such as Yahoo! Answers and WikiAnswers, have become popular with users as one of the central paradigms for satisfying users' information needs. The task of question retrieval in CQA aims to resolve one's query ...
|
||
| Mapping queries to questions: towards understanding users' information needs | ||
| Yunjun Gao, Lu Chen, Rui Li, Gang Chen | ||
| Pages: 977-980 | ||
| doi>10.1145/2484028.2484138 | ||
|
||
|
In this paper, for the first time, we study the problem of mapping keyword queries to questions on community-based question answering (CQA) sites. Mapping general web queries to questions enables search engines not only to discover explicit and specific ...
|
||
| From keywords to keyqueries: content descriptors for the web | ||
| Tim Gollub, Matthias Hagen, Maximilian Michel, Benno Stein | ||
| Pages: 981-984 | ||
| doi>10.1145/2484028.2484181 | ||
|
||
|
We introduce the concept of keyqueries as dynamic content descriptors for documents. Keyqueries are defined implicitly by the index and the retrieval model of a reference search engine: keyqueries for a document are the minimal queries that return the ...
|
||
| Commodity query by snapping | ||
| Hao Huang, Yunjun Gao, Kevin Chiew, Qinming He, Lu Chen | ||
| Pages: 985-988 | ||
| doi>10.1145/2484028.2484120 | ||
|
||
|
Commodity information such as prices and public reviews is always the concern of consumers. Helping them conveniently acquire this information as an instant reference is often of practical significance for their purchase activities. Nowadays, Web 2.0, ...
|
||
| Temporal variance of intents in multi-faceted event-driven information needs | ||
| Stewart Whiting, Ke Zhou, Joemon Jose, Mounia Lalmas | ||
| Pages: 989-992 | ||
| doi>10.1145/2484028.2484169 | ||
|
||
|
Time is often important for understanding user intent during search activity, especially for information needs related to event-driven topics. Diversity for multi-faceted information needs ensures that ranked documents optimally cover multiple facets ...
|
||
| Pursuing insights about healthcare utilization via geocoded search queries | ||
| Shuang-Hong Yang, Ryen W. White, Eric Horvitz | ||
| Pages: 993-996 | ||
| doi>10.1145/2484028.2484147 | ||
|
||
|
Mobile devices provide people with a conduit to the rich information resources of the Web. With consent, the devices can also provide streams of information about search activity and location that can be used in population studies and real-time assistance. ...
|
||
| SESSION: Short papers 2 -- retrieval models and ranking | ||
| Effectiveness/efficiency tradeoffs for candidate generation in multi-stage retrieval architectures | ||
| Nima Asadi, Jimmy Lin | ||
| Pages: 997-1000 | ||
| doi>10.1145/2484028.2484132 | ||
|
||
|
This paper examines a multi-stage retrieval architecture consisting of a candidate generation stage, a feature extraction stage, and a reranking stage using machine-learned models. Given a fixed set of features and a learning-to-rank model, we explore ...
|
||
| Estimating topical context by diverging from external resources | ||
| Romain Deveaud, Eric SanJuan, Patrice Bellot | ||
| Pages: 1001-1004 | ||
| doi>10.1145/2484028.2484148 | ||
|
||
|
Improving query understanding is crucial for providing the user with information that suits her needs. To this end, the retrieval system must be able to deal with several sources of knowledge from which it could infer a topical context. The use of external ...
|
||
| Finding knowledgeable groups in enterprise corpora | ||
| Shangsong Liang, Maarten de Rijke | ||
| Pages: 1005-1008 | ||
| doi>10.1145/2484028.2484109 | ||
|
||
|
The task of finding groups is a natural extension of search tasks aimed at retrieving individual entities. We introduce a group finding task: given a query topic, find knowledgeable groups that have expertise on that topic. We present four general strategies ...
|
||
| Neighbourhood preserving quantisation for LSH | ||
| Sean Moran, Victor Lavrenko, Miles Osborne | ||
| Pages: 1009-1012 | ||
| doi>10.1145/2484028.2484162 | ||
|
||
|
We introduce a scheme for optimally allocating multiple bits per hyperplane for Locality Sensitive Hashing (LSH). Existing approaches binarise LSH projections by thresholding at zero yielding a single bit per dimension. We demonstrate that this is a ...
|
||
| Shame to be sham: addressing content-based grey hat search engine optimization | ||
| Fiana Raiber, Kevyn Collins-Thompson, Oren Kurland | ||
| Pages: 1013-1016 | ||
| doi>10.1145/2484028.2484135 | ||
|
||
|
We present an initial study identifying a form of content-based grey hat search engine optimization, in which a Web page contains both potentially relevant content and manipulated content: we call such pages sham documents, because they lie in the grey ...
|
||
| IRWR: incremental random walk with restart | ||
| Weiren Yu, Xuemin Lin | ||
| Pages: 1017-1020 | ||
| doi>10.1145/2484028.2484114 | ||
|
||
|
Random Walk with Restart (RWR) has become an appealing measure of node proximities in emerging applications, e.g. recommender systems and automatic image captioning. In practice, a real graph is typically large, and is frequently updated with small changes. ...
|
||
| Bias-variance decomposition of IR evaluation | ||
| Peng Zhang, Dawei Song, Jun Wang, Yuexian Hou | ||
| Pages: 1021-1024 | ||
| doi>10.1145/2484028.2484127 | ||
|
||
|
It has been recognized that, when an information retrieval (IR) system achieves improvement in mean retrieval effectiveness (e.g. mean average precision (MAP)) over all the queries, the performance (e.g., average precision (AP)) of some individual queries ...
|
||
| An adaptive evidence weighting method for medical record search | ||
| Dongqing Zhu, Ben Carterette | ||
| Pages: 1025-1028 | ||
| doi>10.1145/2484028.2484175 | ||
|
||
|
In this paper, we present a medical record search system which is useful for identifying cohorts required in clinical studies. In particular, we propose a query-adaptive weighting method that can dynamically aggregate and score evidence in multiple medical ...
|
||
| Fresh BrowseRank | ||
| Maxim Zhukovskiy, Andrei Khropov, Gleb Gusev, Pavel Serdyukov | ||
| Pages: 1029-1032 | ||
| doi>10.1145/2484028.2484186 | ||
|
||
|
In recent years, the problem of page authority computation based on user browsing behavior has attracted a lot of attention. However, the proposed methods have a number of limitations. In particular, they run on a single snapshot of a user browsing ...
|
||
| SESSION: Short papers 2 -- social media IR | ||
| Competition-based networks for expert finding | ||
| Çiğdem Aslay, Neil O'Hare, Luca Maria Aiello, Alejandro Jaimes | ||
| Pages: 1033-1036 | ||
| doi>10.1145/2484028.2484183 | ||
|
||
|
Finding experts in question answering platforms has important applications, such as question routing or identification of best answers. Addressing the problem of ranking users with respect to their expertise, we propose Competition-Based Expertise Networks ...
|
||
| A study on the accuracy of Flickr's geotag data | ||
| Claudia Hauff | ||
| Pages: 1037-1040 | ||
| doi>10.1145/2484028.2484154 | ||
|
||
|
Obtaining geographically tagged multimedia items from social Web platforms such as Flickr is beneficial for a variety of applications including the automatic creation of travelogues and personalized travel recommendations. In order to take advantage ...
|
||
| Finding impressive social content creators: searching for SNS illustrators using feedback on motifs and impressions | ||
| Yohei Seki, Kiyoto Miyajima | ||
| Pages: 1041-1044 | ||
| doi>10.1145/2484028.2484133 | ||
|
||
|
We propose a method for finding impressive creators in online social network sites (SNSs). Many users are actively engaged in publishing their own works, sharing visual content on sites such as YouTube or Flickr. In this paper, we focus on the Japanese ...
|
||
| Informational friend recommendation in social media | ||
| Shengxian Wan, Yanyan Lan, Jiafeng Guo, Chaosheng Fan, Xueqi Cheng | ||
| Pages: 1045-1048 | ||
| doi>10.1145/2484028.2484179 | ||
|
||
|
It is well recognized that users rely on social media (e.g. Twitter or Digg) to fulfill two common needs (i.e. social need and informational need), that is, to keep in touch with their friends in the real world and to have access to information they are ...
|
||
| SESSION: Short papers 2 -- topic models | ||
| Using social annotations to enhance document representation for personalized search | ||
| Mohamed Reda BOUADJENEK, Hakim Hacid, Mokrane Bouzeghoub, Athena Vakali | ||
| Pages: 1049-1052 | ||
| doi>10.1145/2484028.2484130 | ||
|
||
|
In this paper, we present a contribution to IR modeling. We propose an approach that computes, on the fly, a Personalized Social Document Representation (PSDR) of each document per user based on his social activities. The PSDRs are used to rank documents ...
|
||
| The bag-of-repeats representation of documents | ||
| Matthias Gallé | ||
| Pages: 1053-1056 | ||
| doi>10.1145/2484028.2484142 | ||
|
||
|
n-gram representations of documents may improve over a simple bag-of-word representation by relaxing the independence assumption of words and introducing context. However, this comes at a cost of adding features which are non-descriptive, and increasing ...
|
||
| An LDA-smoothed relevance model for document expansion: a case study for spoken document retrieval | ||
| Debasis Ganguly, Johannes Leveling, Gareth J.F. Jones | ||
| Pages: 1057-1060 | ||
| doi>10.1145/2484028.2484110 | ||
|
||
|
Document expansion (DE) in information retrieval (IR) involves modifying each document in the collection by introducing additional terms into the document. It is particularly useful to improve retrieval of short and noisy documents where the additional ...
|
||
| SESSION: Short papers 2 -- users and interactive IR | ||
| Timeline generation with social attention | ||
| Xin Wayne Zhao, Yanwei Guo, Rui Yan, Yulan He, Xiaoming Li | ||
| Pages: 1061-1064 | ||
| doi>10.1145/2484028.2484103 | ||
|
||
|
Timeline generation is an important research task which can help users to have a quick understanding of the overall evolution of any given topic. It has thus attracted much attention from research communities in recent years. Nevertheless, existing work on ...
|
||
| Explicit feedback in local search tasks | ||
| Dmitry Lagun, Avneesh Sud, Ryen W. White, Peter Bailey, Georg Buscher | ||
| Pages: 1065-1068 | ||
| doi>10.1145/2484028.2484123 | ||
|
||
|
Modern search engines make extensive use of people's contextual information to finesse result rankings. Using a searcher's location provides an especially strong signal for adjusting results for certain classes of queries where people may have clear ...
|
||
| Ranking explanatory sentences for opinion summarization | ||
| Hyun Duk Kim, Malu G. Castellanos, Meichun Hsu, ChengXiang Zhai, Umeshwar Dayal, Riddhiman Ghosh | ||
| Pages: 1069-1072 | ||
| doi>10.1145/2484028.2484172 | ||
|
||
|
We introduce a novel sentence ranking problem called explanatory sentence extraction (ESE) which aims to rank sentences in opinionated text based on their usefulness for helping users understand the detailed reasons of sentiments (i.e., "explanatoriness"). ...
|
||
| #trapped!: social media search system requirements for emergency management professionals | ||
| Stefan Raue, Leif Azzopardi, Chris W. Johnson | ||
| Pages: 1073-1076 | ||
| doi>10.1145/2484028.2484184 | ||
|
||
|
Social media provides a new and potentially rich source of information for emergency management services. However, extracting the relevant information from such streams poses a number of difficult challenges. In this short paper, we survey emergency ...
|
||
| DEMONSTRATION SESSION: Demonstrations 1 -- Users and interactive IR | ||
| ThemeStreams: visualizing the stream of themes discussed in politics | ||
| Ork de Rooij, Daan Odijk, Maarten de Rijke | ||
| Pages: 1077-1078 | ||
| doi>10.1145/2484028.2484215 | ||
|
||
|
The political landscape is fluid. Discussions are always ongoing and new "hot topics" continue to appear in the headlines. But what made people start talking about that topic? And who started it? Because of the speed at which discussions sometimes take ...
|
||
| BATC: a benchmark for aggregation techniques in crowdsourcing | ||
| Quoc Viet Hung Nguyen, Thanh Tam Nguyen, Ngoc Tran Lam, Karl Aberer | ||
| Pages: 1079-1080 | ||
| doi>10.1145/2484028.2484199 | ||
|
||
|
As the volumes of AI problems involving human knowledge are likely to soar, crowdsourcing has become essential in a wide range of world-wide-web applications. One of the biggest challenges of crowdsourcing is aggregating the answers collected from crowd ...
|
||
| Spacious: an interactive mental search interface | ||
| Phong D. Vo, Hichem Sahbi | ||
| Pages: 1081-1082 | ||
| doi>10.1145/2484028.2484203 | ||
|
||
|
We introduce in this work a novel approach for semantic indexing and mental image search. Given semantic concepts defined by few training examples, our formulation is transductive and learns a mapping from an initial ambient space, related to low level ...
|
||
| DEMONSTRATION SESSION: Demonstrations 1 -- IR and structured data | ||
| Flex-BaseX: an XML engine with a flexible extension of XQuery full-text | ||
| Emanuele Panzeri, Gabriella Pasi | ||
| Pages: 1083-1084 | ||
| doi>10.1145/2484028.2484216 | ||
|
||
|
XML is the most used language for structuring data and documents, besides being the de-facto standard for data exchange. Keyword based search has been implemented by the XQuery Full-Text language extension, allowing document fragments to be retrieved ...
|
||
| ProductSeeker: entity-based product retrieval for e-commerce | ||
| Hongzhi Wang, Xiaodong Zhang, Jianzhong Li, Hong Gao | ||
| Pages: 1085-1086 | ||
| doi>10.1145/2484028.2484205 | ||
|
||
|
The retrieval results of online product information in e-commerce web sites are often difficult for users to use because of different descriptions for the same product. This paper proposes ProductSeeker, a product retrieval system organizing results ...
|
||
| DEMONSTRATION SESSION: Demonstrations 1 -- information extraction | ||
| Live nuggets extractor: a semi-automated system for text extraction and test collection creation | ||
| Matthew Ekstrand-Abueg, Virgil Pavlu, Javed A. Aslam | ||
| Pages: 1087-1088 | ||
| doi>10.1145/2484028.2484211 | ||
|
||
|
The Live Nugget Extractor system provides users with a method of efficiently and accurately collecting relevant information for any web query rather than providing a simple ranked list of documents. The system utilizes an online learning procedure to ...
|
||
| X-ENS: semantic enrichment of web search results at real-time | ||
| Pavlos Fafalios, Yannis Tzitzikas | ||
| Pages: 1089-1090 | ||
| doi>10.1145/2484028.2484200 | ||
|
||
|
While more and more semantic data are published on the Web, an important question is how typical web users can access and exploit this body of knowledge. Although existing interaction paradigms in semantic search hide the complexity behind an easy-to-use ...
|
||
| Accurate and robust text detection: a step-in for text retrieval in natural scene images | ||
| Xu-Cheng Yin, Xuwang Yin, Kaizhu Huang, Hong-Wei Hao | ||
| Pages: 1091-1092 | ||
| doi>10.1145/2484028.2484197 | ||
|
||
|
We propose and implement a robust text detection system, which is a prominent step-in for text retrieval in natural scene images or videos. Our system includes several key components: (1) A fast and effective pruning algorithm is designed to extract ...
|
||
| DEMONSTRATION SESSION: Demonstrations 1 -- filtering and recommending | ||
| A framework for specific term recommendation systems | ||
| Thomas Lüke, Philipp Schaer, Philipp Mayr | ||
| Pages: 1093-1094 | ||
| doi>10.1145/2484028.2484207 | ||
|
||
|
In this paper we present the IRSA framework that enables the automatic creation of search term suggestion or recommendation systems (TS). Such TS are used to operationalize interactive query expansion and help users in refining their information need ...
|
||
| TweetMogaz: a news portal of tweets | ||
| Walid Magdy | ||
| Pages: 1095-1096 | ||
| doi>10.1145/2484028.2484212 | ||
|
||
|
Twitter is currently one of the largest social hubs for users to spread and discuss news. For most of the top news stories happening, there are corresponding discussions on social media. In this demonstration TweetMogaz is presented, which is a platform ...
|
||
| DEMONSTRATION SESSION: Demonstrations 1 -- classification and clustering | ||
| InfoLand: information lay-of-land for session search | ||
| Jiyun Luo, Dongyi Guan, Hui Yang | ||
| Pages: 1097-1098 | ||
| doi>10.1145/2484028.2484213 | ||
|
||
|
Search result clustering (SRC) is a post-retrieval process that hierarchically organizes search results. The hierarchical structure offers an overview of the search results and displays an "information lay-of-land" that intends to guide the users throughout ...
|
||
| A portable multilingual medical directory by automatic categorization of Wikipedia articles | ||
| Fernando Ruiz-Rico, María-Consuelo Rubio-Sánchez, David Tomás, Jose-Luis Vicedo | ||
| Pages: 1099-1100 | ||
| doi>10.1145/2484028.2484217 | ||
|
||
|
Wikipedia has become one of the most important sources of information available all over the world. However, the categorization of Wikipedia articles is not standardized and the searches are mainly performed on keywords rather than concepts. In this ...
|
||
| DEMONSTRATION SESSION: Demonstrations 2 -- users and interactive IR | ||
| A geolinguistic web application based on linked open data | ||
| Emanuele Di Buccio, Giorgio Maria Di Nunzio, Gianmaria Silvello | ||
| Pages: 1101-1102 | ||
| doi>10.1145/2484028.2484219 | ||
|
||
|
Digital Geolinguistic systems encourage collaboration between linguists, historians, archaeologists, ethnographers, as they explore the relationship between language and cultural adaptation and change. In this demo, we propose a Linked Open Data approach ...
|
||
| TopicVis: a GUI for topic-based feedback and navigation | ||
| Debasis Ganguly, Manisha Ganguly, Johannes Leveling, Gareth J.F. Jones | ||
| Pages: 1103-1104 | ||
| doi>10.1145/2484028.2484202 | ||
|
||
|
This paper describes a search system which includes topic model visualization to improve the user search experience. The system graphically renders the topics in a retrieved set of documents, enables a user to selectively refine search results and allows ...
|
||
| Information seeking in digital cultural heritage with PATHS | ||
| Mark M. Hall, Paul D. Clough, Samuel Fernando, Paula Goodale, Mark Stevenson, Eneko Agirre, Arantxa Otegi, Aitor Soroa, Kate Fernie, Jillian Griffiths, Runar Bergheim | ||
| Pages: 1105-1106 | ||
| doi>10.1145/2484028.2484210 | ||
|
||
|
Current Information Retrieval systems for digital cultural heritage support only the actual search aspect of the information seeking process. This demonstration presents the second PATHS system which provides the exploration, analysis, and sense-making ...
|
||
| DEMONSTRATION SESSION: Demonstrations 2 -- IR and structured data | ||
| Answering natural language queries over linked data graphs: a distributional semantics approach | ||
| André Freitas, Fabrício F. de Faria, Seán O'Riain, Edward Curry | ||
| Pages: 1107-1108 | ||
| doi>10.1145/2484028.2484209 | ||
|
||
|
This paper demonstrates Treo, a natural language query mechanism for Linked Data graphs. The approach uses a distributional semantic vector space model to semantically match user query terms with data, supporting vocabulary-independent (or ...
|
||
| Removing the mismatch headache in XML keyword search | ||
| Yong Zeng, Zhifeng Bao, Tok Wang Ling, Guoliang Li | ||
| Pages: 1109-1110 | ||
| doi>10.1145/2484028.2484218 | ||
|
||
|
In this demo, we study one category of query refinement problems in the context of XML keyword search, where what users search for does not exist in the data while useless results are returned by the search engine. It is a hidden but important problem. ...
|
||
| YaLi: a crowdsourcing plug-in for NERD | ||
| Yafang Wang, Lili Jiang, Johannes Hoffart, Gerhard Weikum | ||
| Pages: 1111-1112 | ||
| doi>10.1145/2484028.2484206 | ||
|
||
|
We demonstrate the YaLi browser plug-in which discovers named entities in Web pages and provides background knowledge about them. The plug-in is implemented with two purposes. From a user perspective, it enriches the browsing experience with entities, ...
|
||
| DEMONSTRATION SESSION: Demonstrations 2 -- information extraction | ||
| SearchResultFinder: federated search made easy | ||
| Dolf Trieschnigg, Kien Tjin-Kam-Jet, Djoerd Hiemstra | ||
| Pages: 1113-1114 | ||
| doi>10.1145/2484028.2484198 | ||
|
Building a federated search engine based on a large number of existing web search engines is a challenge: implementing the programming interface (API) for each search engine is an exacting and time-consuming job. In this demonstration we present SearchResultFinder, ...
|
||
| DEMONSTRATION SESSION: Demonstrations 2 -- filtering and recommending | ||
| Online matching of web content to closed captions in IntoNow | ||
| Carlos Castillo, Gianmarco De Francisci Morales, Ajay Shekhawat | ||
| Pages: 1115-1116 | ||
| doi>10.1145/2484028.2484204 | ||
|
IntoNow is a mobile application that provides a second-screen experience to television viewers. IntoNow uses the microphone of the companion device to sample the audio coming from the TV set, and compares it against a database of TV shows in order to ...
|
||
| Match the news: a firefox extension for real-time news recommendation | ||
| Margarita Karkali, Dimitris Pontikis, Michalis Vazirgiannis | ||
| Pages: 1117-1118 | ||
| doi>10.1145/2484028.2484208 | ||
|
We present Match the News, a browser extension for real time news recommendation. Our extension works on the client side to recommend in real time recently published articles that are relevant to the web page the user is currently visiting. Match ...
|
||
| DEMONSTRATION SESSION: Demonstrations 2 -- classification and clustering | ||
| Demonstration of citation pattern analysis for plagiarism detection | ||
| Bela Gipp, Norman Meuschke, Corinna Breitinger, Mario Lipinski, Andreas Nürnberger | ||
| Pages: 1119-1120 | ||
| doi>10.1145/2484028.2484214 | ||
| A multilingual and multiplatform application for medicinal plants prescription from medical symptoms | ||
| Fernando Ruiz-Rico, David Tomás, Jose-Luis Vicedo, María-Consuelo Rubio-Sánchez | ||
| Pages: 1121-1122 | ||
| doi>10.1145/2484028.2484201 | ||
|
This paper presents an application for medicinal plants prescription based on text classification techniques. The system receives as an input a free text describing the symptoms of a user, and retrieves a ranked list of medicinal plants related to those ...
|
||
| TUTORIAL SESSION: Tutorials | ||
| Searching in the city of knowledge: challenges and recent developments | ||
| Veli Bicer, Vanessa Lopez | ||
| Pages: 1123-1123 | ||
| doi>10.1145/2484028.2484195 | ||
|
Today plenty of data is emerging from various city systems. Beyond the classical Web resources, large amounts of data are retrieved from sensors, devices, social networks, governmental applications, or service networks. In such a diversity of information, ...
|
||
| Scalability and efficiency challenges in commercial web search engines | ||
| B. Barla Cambazoglu, Ricardo Baeza-Yates | ||
| Pages: 1124-1124 | ||
| doi>10.1145/2484028.2484189 | ||
|
Commercial web search engines rely on very large compute infrastructures to be able to cope with the continuous growth of the Web and user bases. Achieving scalability and efficiency in such large-scale search engines requires making careful architectural ...
|
||
| Music similarity and retrieval | ||
| Peter Knees, Markus Schedl | ||
| Pages: 1125-1125 | ||
| doi>10.1145/2484028.2484193 | ||
|
This tutorial serves as an introductory course to the field of and state-of-the-art in music information retrieval (MIR) and in particular to music similarity estimation which is an essential component of music retrieval. Apart from explaining approaches ...
|
||
| The cluster hypothesis in information retrieval | ||
| Oren Kurland | ||
| Pages: 1126-1126 | ||
| doi>10.1145/2484028.2484192 | ||
| Entity linking and retrieval | ||
| Edgar Meij, Krisztian Balog, Daan Odijk | ||
| Pages: 1127-1127 | ||
| doi>10.1145/2484028.2484188 | ||
|
This full-day tutorial presents a comprehensive introduction to entity linking and retrieval. Part I provides a detailed overview of entity linking: identifying and disambiguating entity occurrences in unstructured text. Part II focuses on entity retrieval, ...
|
||
| Kernel-based learning to rank with syntactic and semantic structures | ||
| Alessandro Moschitti | ||
| Pages: 1128-1128 | ||
| doi>10.1145/2484028.2484196 | ||
|
Kernel Methods (KMs) are powerful machine learning techniques that can alleviate the data representation problem as they substitute scalar product between feature vectors with similarity functions (kernels) directly defined between data instances, e.g., ...
|
||
| Designing search usability | ||
| Tony Russell-Rose | ||
| Pages: 1129-1129 | ||
| doi>10.1145/2484028.2484191 | ||
|
Search is not just a box and ten blue links. Search is a journey: an exploration where what we encounter along the way changes what we seek. But in order to guide people along this journey, we must understand both the art and science of search experience ...
|
||
| Diversity and novelty in information retrieval | ||
| Rodrygo L.T. Santos, Pablo Castells, Ismail Sengor Altingovde, Fazli Can | ||
| Pages: 1130-1130 | ||
| doi>10.1145/2484028.2484187 | ||
|
This tutorial aims to provide a unifying account of current research on diversity and novelty in different IR domains, namely, in the context of search engines, recommender systems, and data streams.
|
||
| Multimedia recommendation: technology and techniques | ||
| Jialie Shen, Meng Wang, Shuicheng Yan, Peng Cui | ||
| Pages: 1131-1131 | ||
| doi>10.1145/2484028.2484194 | ||
|
In recent years, we have witnessed a rapid growth in the availability of digital multimedia on various application platforms and domains. Consequently, the problem of information overload has become more and more serious. In order to tackle the challenge, ...
|
||
| Building test collections: an interactive tutorial for students and others without their own evaluation conference series | ||
| Ian M. Soboroff | ||
| Pages: 1132-1132 | ||
| doi>10.1145/2484028.2484190 | ||
|
While existing test collections and evaluation conference efforts may sufficiently support one's research, one can easily find oneself wanting to solve problems no one else is solving yet. But how can research in IR be done (or be published!) without ...
|
||
| WORKSHOP SESSION: Workshops | ||
| Workshop on benchmarking adaptive retrieval and recommender systems: BARS 2013 | ||
| Pablo Castells, Frank Hopfgartner, Alan Said, Mounia Lalmas | ||
| Pages: 1133-1133 | ||
| doi>10.1145/2484028.2484224 | ||
|
Evaluating adaptive and personalized information retrieval techniques is known to be a difficult endeavor. The rapid evolution of novel technologies in this scope raises additional challenges that further stress the need for new evaluation approaches ...
|
||
| SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation | ||
| Charles L.A. Clarke, Luanne Freund, Mark D. Smucker, Emine Yilmaz | ||
| Pages: 1134-1134 | ||
| doi>10.1145/2484028.2484222 | ||
|
The SIGIR 2013 Workshop on Modeling User Behavior for Information Retrieval Evaluation (MUBE 2013) brings together people to discuss existing and new approaches, ways to collaborate, and other ideas and issues involved in improving information retrieval ...
|
||
| Internet advertising: theory and practice | ||
| Bin Gao, Jun Yan, Dou Shen, Tie-Yan Liu | ||
| Pages: 1135-1135 | ||
| doi>10.1145/2484028.2484221 | ||
|
Internet advertising, a form of advertising that utilizes the Internet to deliver marketing messages and attract customers, has seen exponential growth since its inception around twenty years ago; it has been pivotal to the success of the World Wide ...
|
||
| Exploration, navigation and retrieval of information in cultural heritage: ENRICH 2013 | ||
| Séamus Lawless, Maristella Agosti, Paul Clough, Owen Conlan | ||
| Pages: 1136-1136 | ||
| doi>10.1145/2484028.2491801 | ||
|
The Exploration, Navigation and Retrieval of Information in Cultural Heritage Workshop (ENRICH 2013) offers a forum to 1) discuss the challenges and opportunities in Information Retrieval research in the area of Cultural Heritage; 2) encourage collaboration ...
|
||
| SIGIR 2013 workshop on time aware information access (#TAIA2013) | ||
| Fernando Diaz, Susan Dumais, Miles Efron, Kira Radinsky, Maarten de Rijke, Milad Shokouhi | ||
| Pages: 1137-1137 | ||
| doi>10.1145/2484028.2491802 | ||
|
Web content increasingly reflects the current state of the physical and social world, manifested both in traditional news media sources along with user-generated publishing sites such as Twitter, Foursquare, and Facebook. At the same time, web searching ...
|
||
| Workshop on health search and discovery: helping users and advancing medicine | ||
| Ryen W. White, Elad Yom-Tov, Eric Horvitz, Eugene Agichtein, William Hersh | ||
| Pages: 1138-1138 | ||
| doi>10.1145/2484028.2484220 | ||
|
This workshop brings together researchers and practitioners from industry and academia to discuss search and discovery in the medical domain. The event focuses on ways to make medical and health information more accessible to laypeople (including enhancements ...
|
||
| EuroHCIR2013: the 3rd European workshop on human-computer interaction and information retrieval | ||
| Max L. Wilson, Birger Larsen, Preben Hansen, Kristian Norling, Tony Russell-Rose | ||
| Pages: 1139-1139 | ||
| doi>10.1145/2484028.2484223 | ||
|
A proposal summary for the EuroHCIR workshop at SIGIR2013.
|
||
| SESSION: Doctoral consortium | ||
| Beyond relevance: on novelty and diversity in tag recommendation | ||
| Fabiano Belém | ||
| Pages: 1140-1140 | ||
| doi>10.1145/2484028.2484229 | ||
|
We propose to explicitly exploit issues related to novelty and diversity in tag recommendation tasks, an unexplored research avenue (only relevance issues have been investigated so far), in order to improve user experience and satisfaction. We propose ...
|
||
| Group-support for task-based information searching: a knowledge-based approach | ||
| Thilo Boehm | ||
| Pages: 1141-1141 | ||
| doi>10.1145/2484028.2484235 | ||
| Diversified relevance feedback | ||
| Matt Crane | ||
| Pages: 1142-1142 | ||
| doi>10.1145/2484028.2484227 | ||
|
The need for a search engine to deal with ambiguous queries has been known for a long time (diversification). However, it is only recently that this need has become a focus within information retrieval research. How to respond to indications that a result ...
|
||
| Segmentation strategies for passage retrieval in audio-visual documents | ||
| Petra Galuščáková | ||
| Pages: 1143-1143 | ||
| doi>10.1145/2484028.2484237 | ||
|
The importance of Information Retrieval (IR) in audio-visual recordings has been increasing with steeply growing numbers of audio-visual documents available on-line. Compared to traditional IR methods, this task requires specific techniques, such as ...
|
||
| Indexing and querying overlapping structures | ||
| Faegheh Hasibi | ||
| Pages: 1144-1144 | ||
| doi>10.1145/2484028.2484234 | ||
|
Structural information retrieval is mostly based on hierarchy. However, in real life information is not purely hierarchical and structural elements may overlap each other. The most common example is a document with two distinct structural views, where ...
|
||
| A query and patient understanding framework for medical records search | ||
| Nut Limsopatham | ||
| Pages: 1145-1145 | ||
| doi>10.1145/2484028.2484228 | ||
|
Electronic medical records (EMRs) are being increasingly used worldwide to facilitate improved healthcare services [2,3]. They describe the clinical decision process relating to a patient, detailing the observed symptoms, the conducted diagnostic tests, ...
|
||
| Semantic models for answer re-ranking in question answering | ||
| Piero Molino | ||
| Pages: 1146-1146 | ||
| doi>10.1145/2484028.2484233 | ||
|
The task of Question Answering (QA) is to find correct answers to users' questions expressed in natural language. In the last few years non-factoid QA received more attention. It focuses on causation, manner and reason questions, where the expected answer ...
|
||
| Task differentiation for personal search evaluation | ||
| Seyedeh Sargol Sadeghi | ||
| Pages: 1147-1147 | ||
| doi>10.1145/2484028.2484236 | ||
| The role of current working context in professional search | ||
| Maya Sappelli | ||
| Pages: 1148-1148 | ||
| doi>10.1145/2484028.2484231 | ||
|
Today's working world of knowledge workers is changing rapidly. The available information that they need to process is ever growing. In addition, the characteristics of their work are changing as people can and do their work from home. This has resulted ...
|
||
| How far will you go?: characterizing and predicting online search stopping behavior using information scent and need for cognition | ||
| Wan-Ching Wu | ||
| Pages: 1149-1149 | ||
| doi>10.1145/2484028.2484232 | ||
| Effective approaches to retrieving and using expertise in social media | ||
| Reyyan Yeniterzi | ||
| Pages: 1150-1150 | ||
| doi>10.1145/2484028.2484230 | ||
|
Expert retrieval has been widely studied especially after the introduction of Expert Finding task in the TREC's Enterprise Track in 2005 [3]. This track provided two different test collections crawled from two organizations' public-facing websites and ...
|
||
Welcome to SIGIR, the 36th annual international ACM conference on research and development in Information Retrieval. SIGIR is the premier, international venue for research and development in information retrieval. We believe the breadth and diversity of research that comprises the program reflects the health of the organization and major future directions of the field. We are grateful to all those who submitted papers to the conference and gave the Committee an opportunity to evaluate their work for potential inclusion in the program. We are also grateful to the 50 Area Chairs and 204 general program committee members, who represent 30 countries and over 120 institutions, for all the hard work they put into evaluating submissions.
The conference received 366 full paper submissions this year. Of these, 73 (20%) were accepted, essentially the same as last year's acceptance rate and the year before. The top five countries in terms of accepted papers (according to contact author affiliation) were the U.S.A. (28), China (9), the Netherlands, Singapore, and U.K. (5 each). The top five technical areas covered by the accepted papers (as indicated by the primary keyword assigned by paper authors) were users and interactive IR (16%), search engine architecture and scalability (15%), queries and query analysis (15%), evaluation (11%), and retrieval models and ranking (11%). This represents only a slight re-ordering of topics from last year. Two hundred fifty papers were submitted to the short papers track, which represents a 20% increase in the number of submissions made to last year's poster track. Eighty-five (34%) short papers were accepted. In addition, 46 demonstrations were proposed, of which 23 (50%) were accepted. The program also consisted of 7 workshops and 10 tutorials. Finally, the Doctoral Consortium hosted 11 students this year from 10 countries and 11 institutions.
As has been customary for many years, SIGIR 2013 used a two-tier double-blind review process. In the first stage, at least three reviewers read every paper and provided ratings and comments. Papers were evaluated according to seven main criteria: relevance, originality, soundness, quality of the presentation, impact, coverage of the literature, and, for the first time, reproducibility of the results. In the second stage, the primary and secondary Area Chairs ensured the quality of the reviewing process by studying, validating, and summarizing these reviews, and adding their own feedback and ratings. Area Chairs initiated discussions among reviewers to resolve any controversial issues or significant differences of opinion. Once the discussion stage was completed, the two Area Chairs made a recommendation for nearly all submissions. This year we allowed Area Chairs to indicate that a paper should be accepted if room permitted ("accept if room"). At the program committee meeting held in Amsterdam, The Netherlands, the Program Chairs and the attending Area Chairs went over the reviews, verified the process, gathered additional input, and discussed and decided on papers that were balloted as "accept if room", papers from which the primary Area Chair abstained, and papers that had unusual score distributions. For papers that were balloted as "accept if room", we especially considered the potential for the paper to provoke interesting and fruitful discussion at the conference. Ultimately, 73 papers were selected for inclusion in the program.
One important change to this year's program was renaming the poster paper submission type to short papers and increasing the length of the paper from two to four pages. Short papers were presented at the conference in poster format, and two separate short paper sessions were included as part of the main conference program, rather than a single event collocated with an evening reception. We believe that increasing the length of the accompanying paper allows researchers to better communicate their experiments and results, which, in turn, will allow this submission type to function as a more comprehensive and substantial container for small but significant findings. We further believe this change better allows research presented in this format to get the attention it deserves. We would like to thank the Short Paper Co-Chairs for all the extra work they did this year managing this new format and the Short Paper reviewers for the great job they did handling both the larger volume of submissions and their increased size. We believe the large increase in number of submissions to this track indicates the community's receptiveness to this change.
We hope you find this program interesting, provocative and inspiring, and that the conference provides you with a valuable opportunity to share ideas with other researchers, practitioners and students from institutions around the world. The deadline for SIGIR 2014 is, after all, only six months away!
Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
| SESSION: Keynote address | ||
| Salton award lecture: information retrieval as engineering science | ||
| Norbert Fuhr | ||
| Pages: 1-2 | ||
| doi>10.1145/2348283.2348285 | ||
| Retrieving information from the book of humanity: the personalized medicine data tsunami crashes on the beach of jeopardy | ||
| Daniel R. Masys | ||
| Pages: 3-4 | ||
| doi>10.1145/2348283.2348286 | ||
|
From a mute but eloquent alphabet of 4 characters emerges a complex biological 'literature' whose highest expression is human existence. The rapidly advancing technologies of 'nextgen sequencing' will soon make it possible to inexpensively acquire and ...
|
||
| SESSION: Query suggestion | ||
| Adaptation of the concept hierarchy model with search logs for query recommendation on intranets | ||
| Ibrahim Adepoju Adeyanju, Dawei Song, M-Dyaa Albakour, Udo Kruschwitz, Anne De Roeck, Maria Fasli | ||
| Pages: 5-14 | ||
| doi>10.1145/2348283.2348288 | ||
|
A concept hierarchy created from a document collection can be used for query recommendation on Intranets by ranking terms according to the strength of their links to the query within the hierarchy. A major limitation is that this model produces the same ...
|
||
| Adaptive query suggestion for difficult queries | ||
| Yang Liu, Ruihua Song, Yu Chen, Jian-Yun Nie, Ji-Rong Wen | ||
| Pages: 15-24 | ||
| doi>10.1145/2348283.2348289 | ||
|
Query suggestion is a useful tool to help users formulate better queries. Although this has been found highly useful globally, its effect on different queries may vary. In this paper, we examine the impact of query suggestion on queries of different ...
|
||
| Learning to suggest: a machine learning framework for ranking query suggestions | ||
| Umut Ozertem, Olivier Chapelle, Pinar Donmez, Emre Velipasaoglu | ||
| Pages: 25-34 | ||
| doi>10.1145/2348283.2348290 | ||
|
We consider the task of suggesting related queries to users after they issue their initial query to a web search engine. We propose a machine learning approach to learn the probability that a user may find a follow-up query both useful and relevant, ...
|
||
| SESSION: Multimedia 1 | ||
| Privacy-aware image classification and search | ||
| Sergej Zerr, Stefan Siersdorfer, Jonathon Hare, Elena Demidova | ||
| Pages: 35-44 | ||
| doi>10.1145/2348283.2348292 | ||
|
Modern content sharing environments such as Flickr or YouTube contain a large amount of private resources such as photos showing weddings, family holidays, and private parties. These resources can be of a highly sensitive nature, disclosing many details ...
|
||
| Manhattan hashing for large-scale image retrieval | ||
| Weihao Kong, Wu-Jun Li, Minyi Guo | ||
| Pages: 45-54 | ||
| doi>10.1145/2348283.2348293 | ||
|
Hashing is used to learn binary-code representation for data with expectation of preserving the neighborhood structure in the original feature space. Due to its fast query speed and reduced storage cost, hashing has been widely used for efficient nearest ...
|
||
| Boosting multi-kernel locality-sensitive hashing for scalable image retrieval | ||
| Hao Xia, Pengcheng Wu, Steven C.H. Hoi, Rong Jin | ||
| Pages: 55-64 | ||
| doi>10.1145/2348283.2348294 | ||
|
Similarity search is a key challenge for multimedia retrieval applications where data are usually represented in high-dimensional space. Among various algorithms proposed for similarity search in high-dimensional space, Locality-Sensitive Hashing (LSH) ...
|
||
| SESSION: Diversity 1 | ||
| Diversity by proportionality: an election-based approach to search result diversification | ||
| Van Dang, W. Bruce Croft | ||
| Pages: 65-74 | ||
| doi>10.1145/2348283.2348296 | ||
|
This paper presents a different perspective on diversity in search results: diversity by proportionality. We consider a result list most diverse, with respect to some set of topics related to the query, when the number of documents it provides on each ...
|
||
| Explicit relevance models in intent-oriented information retrieval diversification | ||
| Saúl Vargas, Pablo Castells, David Vallet | ||
| Pages: 75-84 | ||
| doi>10.1145/2348283.2348297 | ||
|
The intent-oriented search diversification methods developed in the field so far tend to build on generative views of the retrieval system to be diversified. Core algorithm components in particular redundancy assessment are expressed in terms of the ...
|
||
| AspecTiles: tile-based visualization of diversified web search results | ||
| Mayu Iwata, Tetsuya Sakai, Takehiro Yamamoto, Yu Chen, Yi Liu, Ji-Rong Wen, Shojiro Nishio | ||
| Pages: 85-94 | ||
| doi>10.1145/2348283.2348298 | ||
|
A diversified search result for an underspecified query generally contains web pages in which there are answers that are relevant to different aspects of the query. In order to help the user locate such relevant answers, we propose a simple extension ...
|
||
| SESSION: Evaluation 1 | ||
| Time-based calibration of effectiveness measures | ||
| Mark D. Smucker, Charles L.A. Clarke | ||
| Pages: 95-104 | ||
| doi>10.1145/2348283.2348300 | ||
|
Many current effectiveness measures incorporate simplifying assumptions about user behavior. These assumptions prevent the measures from reflecting aspects of the search process that directly impact the quality of retrieval results as experienced by ...
|
||
| Time drives interaction: simulating sessions in diverse searching environments | ||
| Feza Baskaya, Heikki Keskustalo, Kalervo Järvelin | ||
| Pages: 105-114 | ||
| doi>10.1145/2348283.2348301 | ||
|
Real life information retrieval takes place in sessions, where users search by iterating between various cognitive, perceptual and motor subtasks through an interactive interface. The sessions may follow diverse strategies, which, together with the interface ...
|
||
| Evaluating aggregated search pages | ||
| Ke Zhou, Ronan Cummins, Mounia Lalmas, Joemon M. Jose | ||
| Pages: 115-124 | ||
| doi>10.1145/2348283.2348302 | ||
|
Aggregating search results from a variety of heterogeneous sources or verticals such as news, image and video into a single interface is a popular paradigm in web search. Although various approaches exist for selecting relevant verticals or optimising ...
|
||
| SESSION: Structured data | ||
| Combining inverted indices and structured search for ad-hoc object retrieval | ||
| Alberto Tonon, Gianluca Demartini, Philippe Cudré-Mauroux | ||
| Pages: 125-134 | ||
| doi>10.1145/2348283.2348304 | ||
|
Retrieving semi-structured entities to answer keyword queries is an increasingly important feature of many modern Web applications. The fast-growing Linked Open Data (LOD) movement makes it possible to crawl and index very large amounts of structured ...
|
||
| Retrieving similar discussion forum threads: a structure based approach | ||
| Amit Singh, Deepak P, Dinesh Raghu | ||
| Pages: 135-144 | ||
| doi>10.1145/2348283.2348305 | ||
|
Online forums are becoming a popular way of finding useful information on the web. Search over forums for existing discussion threads so far is limited to keyword-based search due to the minimal effort required on part of the users. However, it is often ...
|
||
| Summarizing highly structured documents for effective search interaction | ||
| Lanbo Zhang, Yi Zhang, Yunfei Chen | ||
| Pages: 145-154 | ||
| doi>10.1145/2348283.2348306 | ||
|
As highly structured documents with rich metadata (such as products, movies, etc.) become increasingly prevalent, searching those documents has become an important IR problem. Unfortunately existing work on document summarization, especially in the context ...
|
||
| SESSION: Recommender systems 1 | ||
| TFMAP: optimizing MAP for top-n context-aware recommendation | ||
| Yue Shi, Alexandros Karatzoglou, Linas Baltrunas, Martha Larson, Alan Hanjalic, Nuria Oliver | ||
| Pages: 155-164 | ||
| doi>10.1145/2348283.2348308 | ||
|
In this paper, we tackle the problem of top-N context-aware recommendation for implicit feedback scenarios. We frame this challenge as a ranking problem in collaborative filtering (CF). Much of the past work on CF has not focused on evaluation metrics ...
|
||
| Increasing temporal diversity with purchase intervals | ||
| Gang Zhao, Mong Li Lee, Wynne Hsu, Wei Chen | ||
| Pages: 165-174 | ||
| doi>10.1145/2348283.2348309 | ||
|
The development of Web 2.0 technology has led to huge economic benefits and challenges for both e-commerce websites and online shoppers. One core technology to increase sales and consumers' satisfaction is the use of recommender systems. Existing product ...
|
||
| Adaptive diversification of recommendation results via latent factor portfolio | ||
| Yue Shi, Xiaoxue Zhao, Jun Wang, Martha Larson, Alan Hanjalic | ||
| Pages: 175-184 | ||
| doi>10.1145/2348283.2348310 | ||
|
This paper studies result diversification in collaborative filtering. We argue that the diversification level in a recommendation list should be adapted to the target users' individual situations and needs. Different users may have different ranges of ...
|
||
| SESSION: Users 1: personalization and user modeling | ||
| Modeling the impact of short- and long-term behavior on search personalization | ||
| Paul N. Bennett, Ryen W. White, Wei Chu, Susan T. Dumais, Peter Bailey, Fedor Borisyuk, Xiaoyuan Cui | ||
| Pages: 185-194 | ||
| doi>10.1145/2348283.2348312 | ||
|
User behavior provides many cues to improve the relevance of search results through personalization. One aspect of user behavior that provides especially strong signals for delivering better relevance is an individual's history of queries and clicked ...
|
||
| Improving searcher models using mouse cursor activity | ||
| Jeff Huang, Ryen W. White, Georg Buscher, Kuansan Wang | ||
| Pages: 195-204 | ||
| doi>10.1145/2348283.2348313 | ||
|
Web search components such as ranking and query suggestions analyze the user data provided in query and click logs. While this data is easy to collect and provides information about user behavior, it omits user interactions with the search engine that ...
|
||
| Personalization of search results using interaction behaviors in search sessions | ||
| Chang Liu, Nicholas J. Belkin, Michael J. Cole | ||
| Pages: 205-214 | ||
| doi>10.1145/2348283.2348314 | ||
|
Personalization of search results offers the potential for significant improvement in information retrieval performance. User interactions with the system and documents during information-seeking sessions provide a wealth of information about user preferences ...
|
||
| User evaluation of query quality | ||
| Wan-Ching Wu, Diane Kelly, Kun Huang | ||
| Pages: 215-224 | ||
| doi>10.1145/2348283.2348315 | ||
|
Although a great deal of research has been conducted about automatic techniques for determining query quality, there have been relatively few studies about how people judge query quality. This study investigated this topic through a laboratory experiment ...
|
||
| SESSION: Architectures 1 | ||
| Efficient in-memory top-k document retrieval | ||
| J. Shane Culpepper, Matthias Petri, Falk Scholer | ||
| Pages: 225-234 | ||
| doi>10.1145/2348283.2348317 | ||
|
For over forty years the dominant data structure for ranked document retrieval has been the inverted index. Inverted indexes are effective for a variety of document retrieval tasks, and particularly efficient for large data collection scenarios that ...
|
||
| Index maintenance for time-travel text search | ||
| Avishek Anand, Srikanta Bedathur, Klaus Berberich, Ralf Schenkel | ||
| Pages: 235-244 | ||
| doi>10.1145/2348283.2348318 | ||
|
Time-travel text search enriches standard text search by temporal predicates, so that users of web archives can easily retrieve document versions that are considered relevant to a given keyword query and existed during a given time interval. Different ...
|
||
| Optimizing positional index structures for versioned document collections | ||
| Jinru He, Torsten Suel | ||
| Pages: 245-254 | ||
| doi>10.1145/2348283.2348319 | ||
|
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and documents maintained in revision control systems. Versioned document collections ...
| To index or not to index: time-space trade-offs in search engines with positional ranking functions | ||
| Diego Arroyuelo, Senén González, Mauricio Marin, Mauricio Oyarzún, Torsten Suel | ||
| Pages: 255-264 | ||
| doi>10.1145/2348283.2348320 | ||
Positional ranking functions, widely used in Web search engines, improve result quality by exploiting the positions of the query terms within documents. However, it is well known that positional indexes demand large amounts of extra space, typically ...
| SESSION: Search log analysis | ||
| Studies of the onset and persistence of medical concerns in search logs | ||
| Ryen W. White, Eric Horvitz | ||
| Pages: 265-274 | ||
| doi>10.1145/2348283.2348322 | ||
The Web provides a wealth of information about medical symptoms and disorders. Although this content is often valuable to consumers, studies have found that interaction with Web content may heighten anxiety and stimulate healthcare utilization. We present ...
| A semi-supervised approach to modeling web search satisfaction | ||
| Ahmed Hassan | ||
| Pages: 275-284 | ||
| doi>10.1145/2348283.2348323 | ||
Web search is an interactive process that involves actions from Web search users and responses from the search engine. Many research efforts have been made to address the problem of understanding search behavior in general. Some of this work focused ...
| Social annotations: utility and prediction modeling | ||
| Patrick Pantel, Michael Gamon, Omar Alonso, Kevin Haas | ||
| Pages: 285-294 | ||
| doi>10.1145/2348283.2348324 | ||
Social features are increasingly integrated within the search results page of the main commercial search engines. There is, however, little understanding of the utility of social features in traditional search. In this paper, we study utility in the ...
| An exploration of ranking heuristics in mobile local search | ||
| Yuanhua Lv, Dimitrios Lymberopoulos, Qiang Wu | ||
| Pages: 295-304 | ||
| doi>10.1145/2348283.2348325 | ||
Users increasingly rely on their mobile devices to search local entities, typically businesses, while on the go. Even though recent work has recognized that the ranking signals in mobile local search (e.g., distance and customer rating score of a business) ...
| SESSION: User intent | ||
| Mining query subtopics from search log data | ||
| Yunhua Hu, Yanan Qian, Hang Li, Daxin Jiang, Jian Pei, Qinghua Zheng | ||
| Pages: 305-314 | ||
| doi>10.1145/2348283.2348327 | ||
Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this paper, is a very important issue in web search. Through search log analysis, ...
| Search, interrupted: understanding and predicting search task continuation | ||
| Eugene Agichtein, Ryen W. White, Susan T. Dumais, Paul N. Bennett | ||
| Pages: 315-324 | ||
| doi>10.1145/2348283.2348328 | ||
Many important search tasks require multiple search sessions to complete. Tasks such as travel planning, large purchases, or job searches can span hours, days, or even weeks. Inevitably, life interferes, requiring the searcher either to recover the "state" ...
| Multi-aspect query summarization by composite query | ||
| Wei Song, Qing Yu, Zhiheng Xu, Ting Liu, Sheng Li, Ji-Rong Wen | ||
| Pages: 325-334 | ||
| doi>10.1145/2348283.2348329 | ||
Conventional search engines usually return a ranked list of web pages in response to a query. Users have to visit several pages to locate the relevant parts. A promising future search scenario should involve: (1) understanding user intents; (2) providing ...
| Language intent models for inferring user browsing behavior | ||
| Manos Tsagkias, Roi Blanco | ||
| Pages: 335-344 | ||
| doi>10.1145/2348283.2348330 | ||
Modeling user browsing behavior is an active research area with tangible real-world applications, e.g., organizations can adapt their online presence to their visitors' browsing behavior, with positive effects on user engagement and revenue. We concentrate ...
| SESSION: Efficiency | ||
| Efficient query recommendations in the long tail via center-piece subgraphs | ||
| Francesco Bonchi, Raffaele Perego, Fabrizio Silvestri, Hossein Vahabi, Rossano Venturini | ||
| Pages: 345-354 | ||
| doi>10.1145/2348283.2348332 | ||
We present a recommendation method based on the well-known concept of center-piece subgraph, that allows for the time/space efficient generation of suggestions also for rare, i.e., long-tail queries. Our method is scalable with respect to both the size ...
| Supporting efficient top-k queries in type-ahead search | ||
| Guoliang Li, Jiannan Wang, Chen Li, Jianhua Feng | ||
| Pages: 355-364 | ||
| doi>10.1145/2348283.2348333 | ||
Type-ahead search finds answers on the fly as a user types in a keyword query. A main challenge in this search paradigm is the high-efficiency requirement that queries must be answered within milliseconds. In this paper we study how to answer top-k ...
| SimFusion+: extending simfusion towards efficient estimation on large and dynamic networks | ||
| Weiren Yu, Xuemin Lin, Wenjie Zhang, Ying Zhang, Jiajin Le | ||
| Pages: 365-374 | ||
| doi>10.1145/2348283.2348334 | ||
SimFusion has become a captivating measure of similarity between objects in a web graph. It is iteratively distilled from the notion that "the similarity between two objects is reinforced by the similarity of their related objects". The existing SimFusion ...
| Group matrix factorization for scalable topic modeling | ||
| Quan Wang, Zheng Cao, Jun Xu, Hang Li | ||
| Pages: 375-384 | ||
| doi>10.1145/2348283.2348335 | ||
Topic modeling can reveal the latent structure of text data and is useful for knowledge discovery, search relevance ranking, document classification, and so on. One of the major challenges in topic modeling is to deal with large datasets and large numbers ...
| SESSION: Spam and abuse | ||
| Detecting quilted web pages at scale | ||
| Marc Najork | ||
| Pages: 385-394 | ||
| doi>10.1145/2348283.2348337 | ||
Web-based advertising and electronic commerce, combined with the key role of search engines in driving visitors to ad-monetized and e-commerce web sites, has given rise to the phenomenon of web spam: web pages that are of little value to visitors, but ...
| Fighting against web spam: a novel propagation method based on click-through data | ||
| Chao Wei, Yiqun Liu, Min Zhang, Shaoping Ma, Liyun Ru, Kuo Zhang | ||
| Pages: 395-404 | ||
| doi>10.1145/2348283.2348338 | ||
Combating Web spam is one of the greatest challenges for Web search engines. State-of-the-art anti-spam techniques focus mainly on detecting varieties of spam strategies, such as content spamming and link-based spamming. Although these anti-spam approaches ...
| Learning hash codes for efficient content reuse detection | ||
| Qi Zhang, Yan Wu, Zhuoye Ding, Xuanjing Huang | ||
| Pages: 405-414 | ||
| doi>10.1145/2348283.2348339 | ||
Content reuse is extremely common in user-generated media. Reuse detection serves as the basis for many applications. However, with the explosion of the Internet and the continuously growing use of user-generated media, the task becomes more critical ...
| SESSION: Users 2: exploratory search | ||
| Explanatory semantic relatedness and explicit spatialization for exploratory search | ||
| Brent Hecht, Samuel H. Carton, Mahmood Quaderi, Johannes Schöning, Martin Raubal, Darren Gergle, Doug Downey | ||
| Pages: 415-424 | ||
| doi>10.1145/2348283.2348341 | ||
Exploratory search, in which a user investigates complex concepts, is cumbersome with today's search engines. We present a new exploratory search approach that generates interactive visualizations of query concepts using thematic cartography (e.g. choropleth ...
| A subjunctive exploratory search interface to support media studies researchers | ||
| Marc Bron, Jasmijn van Gorp, Frank Nack, Maarten de Rijke, Andrei Vishneuski, Sonja de Leeuw | ||
| Pages: 425-434 | ||
| doi>10.1145/2348283.2348342 | ||
Media studies concerns the study of production, content, and/or reception of various types of media. Today's continuous production and storage of media is changing the way media studies researchers work and requires the development of new search models ...
| Task complexity, vertical display and user interaction in aggregated search | ||
| Jaime Arguello, Wan-Ching Wu, Diane Kelly, Ashlee Edwards | ||
| Pages: 435-444 | ||
| doi>10.1145/2348283.2348343 | ||
Aggregated search is the task of blending results from specialized search services or verticals into the Web search results. While many studies have focused on aggregated search techniques, few studies have tried to better understand how users interact ...
| SESSION: Multimedia 2 | ||
| Image ranking based on user browsing behavior | ||
| Michele Trevisiol, Luca Chiarandini, Luca Maria Aiello, Alejandro Jaimes | ||
| Pages: 445-454 | ||
| doi>10.1145/2348283.2348345 | ||
Ranking of images is difficult because many factors determine their importance (e.g., popularity, quality, entertainment value, context, etc.). In social media platforms, ranking also depends on social interactions and on the visibility of the images ...
| Modeling concept dynamics for large scale music search | ||
| Jialie Shen, HweeHwa Pang, Meng Wang, Shuicheng Yan | ||
| Pages: 455-464 | ||
| doi>10.1145/2348283.2348346 | ||
Continuing advances in data storage and communication technologies have led to an explosive growth in digital music collections. To cope with their increasing scale, we need effective Music Information Retrieval (MIR) capabilities like tagging, concept ...
| Finding translations in scanned book collections | ||
| Ismet Zeki Yalniz, R. Manmatha | ||
| Pages: 465-474 | ||
| doi>10.1145/2348283.2348347 | ||
This paper describes an approach for identifying translations of books in large scanned book collections with OCR errors. The method is based on the idea that although individual sentences do not necessarily preserve the word order when translated, a ...
| SESSION: Recommender systems 2 | ||
| Predicting the ratings of multimedia items for making personalized recommendations | ||
| Rani Qumsiyeh, Yiu-Kai Ng | ||
| Pages: 475-484 | ||
| doi>10.1145/2348283.2348349 | ||
Existing multimedia recommenders suggest a specific type of multimedia item rather than items of different types personalized for a user based on his/her preference. Assume that a user is interested in a particular family movie; it is appealing if a ...
| Personalized click shaping through lagrangian duality for online recommendation | ||
| Deepak Agarwal, Bee-Chung Chen, Pradheep Elango, Xuanhui Wang | ||
| Pages: 485-494 | ||
| doi>10.1145/2348283.2348350 | ||
Online content recommendation aims to identify trendy articles in a continuously changing dynamic content pool. Most existing work relies on online user feedback, notably clicks, as the objective and maximizes it by showing articles with the highest click-through ...
| What reviews are satisfactory: novel features for automatic helpfulness voting | ||
| Yu Hong, Jun Lu, Jianmin Yao, Qiaoming Zhu, Guodong Zhou | ||
| Pages: 495-504 | ||
| doi>10.1145/2348283.2348351 | ||
This paper focuses on exploring the features of product reviews that satisfy users, by which to improve the automatic helpfulness voting for the reviews on commercial websites. Compared to the previous work, which single-mindedly adopts the textual features ...
| SESSION: Query expansion and reformulation | ||
| Automatic refinement of patent queries using concept importance predictors | ||
| Parvaz Mahdabi, Linda Andersson, Mostafa Keikha, Fabio Crestani | ||
| Pages: 505-514 | ||
| doi>10.1145/2348283.2348353 | ||
Patent prior art queries are full patent applications which are much longer than standard web search topics. Such queries are composed of hundreds of terms and do not represent a focused information need. One way to make the queries more focused is to ...
| Automatic term mismatch diagnosis for selective query expansion | ||
| Le Zhao, Jamie Callan | ||
| Pages: 515-524 | ||
| doi>10.1145/2348283.2348354 | ||
People are seldom aware that their search queries frequently mismatch a majority of the relevant documents. This may not be a big problem for topics with a large and diverse set of relevant documents, but would largely increase the chance of search failure ...
| Generating reformulation trees for complex queries | ||
| Xiaobing Xue, W. Bruce Croft | ||
| Pages: 525-534 | ||
| doi>10.1145/2348283.2348355 | ||
Search queries have evolved beyond keyword queries. Many complex queries such as verbose queries, natural language question queries and document-based queries are widely used in a variety of applications. Processing these complex queries usually requires ...
| Proximity-based rocchio's model for pseudo relevance | ||
| Jun Miao, Jimmy Xiangji Huang, Zheng Ye | ||
| Pages: 535-544 | ||
| doi>10.1145/2348283.2348356 | ||
Rocchio's relevance feedback model is a classic query expansion method and it has been shown to be effective in boosting information retrieval performance. The selection of expansion terms in this method, however, does not take into account the relationship ...
| SESSION: Social media 1 | ||
| Modeling user posting behavior on social media | ||
| Zhiheng Xu, Yang Zhang, Yao Wu, Qing Yang | ||
| Pages: 545-554 | ||
| doi>10.1145/2348283.2348358 | ||
User generated content is the basic element of social media websites. Relatively few studies have systematically analyzed the motivation to create and share content, especially from the perspective of a common user. In this paper, we perform a comprehensive ...
| Friend or frenemy?: predicting signed ties in social networks | ||
| Shuang-Hong Yang, Alexander J. Smola, Bo Long, Hongyuan Zha, Yi Chang | ||
| Pages: 555-564 | ||
| doi>10.1145/2348283.2348359 | ||
We study the problem of labeling the edges of a social network graph (e.g., acquaintance connections in Facebook) as either positive (i.e., trust, true friendship) or negative (i.e., distrust, possible frenemy) relations. Such signed relations provide ...
| Social-network analysis using topic models | ||
| Youngchul Cha, Junghoo Cho | ||
| Pages: 565-574 | ||
| doi>10.1145/2348283.2348360 | ||
In this paper, we discuss how we can extend probabilistic topic models to analyze the relationship graph of popular social-network data, so that we can group or label the edges and nodes in the graph based on their topic similarity. In particular, we ...
| Cognos: crowdsourcing search for topic experts in microblogs | ||
| Saptarshi Ghosh, Naveen Sharma, Fabricio Benevenuto, Niloy Ganguly, Krishna Gummadi | ||
| Pages: 575-590 | ||
| doi>10.1145/2348283.2348361 | ||
Finding topic experts on microblogging sites with millions of users, such as Twitter, is a hard and challenging problem. In this paper, we propose and investigate a new methodology for discovering topic experts in the popular Twitter social network. ...
| SESSION: Query completion and correction | ||
| Automatic suggestion of query-rewrite rules for enterprise search | ||
| Zhuowei Bao, Benny Kimelfeld, Yunyao Li | ||
| Pages: 591-600 | ||
| doi>10.1145/2348283.2348363 | ||
Enterprise search is challenging for several reasons, notably the dynamic terminology and jargon that are specific to the enterprise domain. This challenge is partly addressed by having domain experts maintain the enterprise search engine and adapt ...
| Time-sensitive query auto-completion | ||
| Milad Shokouhi, Kira Radinsky | ||
| Pages: 601-610 | ||
| doi>10.1145/2348283.2348364 | ||
Query auto-completion (QAC) is a common feature in modern search engines. High quality QAC candidates enhance search experience by saving users time that otherwise would be spent on typing each character or word sequentially. Current QAC methods rank ...
| A generalized hidden Markov model with discriminative training for query spelling correction | ||
| Yanen Li, Huizhong Duan, ChengXiang Zhai | ||
| Pages: 611-620 | ||
| doi>10.1145/2348283.2348365 | ||
Query spelling correction is a crucial component of modern search engines. Existing methods in the literature for search query spelling correction have two major drawbacks. First, they are unable to handle certain important types of spelling errors, ...
| SESSION: Architectures 2 | ||
| Learning to predict response times for online query scheduling | ||
| Craig Macdonald, Nicola Tonellotto, Iadh Ounis | ||
| Pages: 621-630 | ||
| doi>10.1145/2348283.2348367 | ||
Dynamic pruning strategies permit efficient retrieval by not fully scoring all postings of the documents matching a query -- without degrading the retrieval effectiveness of the top-ranked results. However, the amount of pruning achievable for a query ...
| Prefetching query results and its impact on search engines | ||
| Simon Jonassen, B. Barla Cambazoglu, Fabrizio Silvestri | ||
| Pages: 631-640 | ||
| doi>10.1145/2348283.2348368 | ||
We investigate the impact of query result prefetching on the efficiency and effectiveness of web search engines. We propose offline and online strategies for selecting and ordering queries whose results are to be prefetched. The offline strategies rely ...
| Online result cache invalidation for real-time web search | ||
| Xiao Bai, Flavio P. Junqueira | ||
| Pages: 641-650 | ||
| doi>10.1145/2348283.2348369 | ||
Caches of results are critical components of modern Web search engines, since they enable lower response time to frequent queries and reduce the load to the search engine backend. Results in long-lived cache entries may become stale, however, as search ...
| SESSION: Recommender systems 3 | ||
| Learning to rank social update streams | ||
| Liangjie Hong, Ron Bekkerman, Joseph Adler, Brian D. Davison | ||
| Pages: 651-660 | ||
| doi>10.1145/2348283.2348371 | ||
As online social media integrates more deeply into our lives, we spend more time consuming social update streams that come from our online connections. Although social update streams provide a tremendous opportunity for us to access information on-the-fly, ...
| Collaborative personalized tweet recommendation | ||
| Kailong Chen, Tianqi Chen, Guoqing Zheng, Ou Jin, Enpeng Yao, Yong Yu | ||
| Pages: 661-670 | ||
| doi>10.1145/2348283.2348372 | ||
Twitter has rapidly grown into a popular social network in recent years and provides a large number of real-time messages for users. Tweets are presented in chronological order and users scan the followees' timelines to find what they are interested in. ...
| Exploring social influence for recommendation: a generative model approach | ||
| Mao Ye, Xingjie Liu, Wang-Chien Lee | ||
| Pages: 671-680 | ||
| doi>10.1145/2348283.2348373 | ||
Social friendship has been shown to be beneficial for item recommendation for years. However, existing approaches mostly incorporate social friendship into recommender systems by heuristics. In this paper, we argue that social influence between ...
| SESSION: Multimedia 3 | ||
| See-to-retrieve: efficient processing of spatio-visual keyword queries | ||
| Chao Zhang, Lidan Shou, Ke Chen, Gang Chen | ||
| Pages: 681-690 | ||
| doi>10.1145/2348283.2348375 | ||
The wide proliferation of powerful smart phones equipped with multiple sensors, 3D graphical engine, and 3G connection has nurtured the creation of a new spectrum of visual mobile applications. These applications require novel data retrieval techniques ...
| Placing images on the world map: a microblog-based enrichment approach | ||
| Claudia Hauff, Geert-Jan Houben | ||
| Pages: 691-700 | ||
| doi>10.1145/2348283.2348376 | ||
Estimating the geographic location of images is a task which has received increasing attention recently. Large numbers of images uploaded to platforms such as Flickr do not contain GPS-based latitude/longitude coordinates. Obtaining such geographic information ...
| Where is who: large-scale photo retrieval by facial attributes and canvas layout | ||
| Yu-Heng Lei, Yan-Ying Chen, Bor-Chun Chen, Lime Iida, Winston H. Hsu | ||
| Pages: 701-710 | ||
| doi>10.1145/2348283.2348377 | ||
The ubiquitous availability of digital cameras has made it easier than ever to capture moments of life, especially the ones accompanied with friends and family. It is generally believed that most family photos are with faces that are sparsely tagged. ...
| SESSION: Entities | ||
| Mining the web for points of interest | ||
| Adam Rae, Vanessa Murdock, Adrian Popescu, Hugues Bouchard | ||
| Pages: 711-720 | ||
| doi>10.1145/2348283.2348379 | ||
A point of interest (POI) is a focused geographic entity such as a landmark, a school, an historical building, or a business. Points of interest are the basis for most of the data supporting location-based applications. In this paper we propose ...
| TwiNER: named entity recognition in targeted twitter stream | ||
| Chenliang Li, Jianshu Weng, Qi He, Yuxia Yao, Anwitaman Datta, Aixin Sun, Bu-Sung Lee | ||
| Pages: 721-730 | ||
| doi>10.1145/2348283.2348380 | ||
Many private and/or public organizations have been reported to create and monitor targeted Twitter streams to collect and understand users' opinions about the organizations. A targeted Twitter stream is usually constructed by filtering tweets ...
| Adaptive context features for toponym resolution in streaming news | ||
| Michael D. Lieberman, Hanan Samet | ||
| Pages: 731-740 | ||
| doi>10.1145/2348283.2348381 | ||
News sources around the world generate constant streams of information, but effective streaming news retrieval requires an intimate understanding of the geographic content of news. This process of understanding, known as geotagging, consists of first ...
| SESSION: Learning to rank | ||
| Structural relationships for large-scale learning of answer re-ranking | ||
| Aliaksei Severyn, Alessandro Moschitti | ||
| Pages: 741-750 | ||
| doi>10.1145/2348283.2348383 | ||
Supervised learning applied to answer re-ranking can highly improve on the overall accuracy of question answering (QA) systems. The key aspect is that the relationships and properties of the question/answer pair composed of a question and the supporting ...
| Top-k learning to rank: labeling, ranking and evaluation | ||
| Shuzi Niu, Jiafeng Guo, Yanyan Lan, Xueqi Cheng | ||
| Pages: 751-760 | ||
| doi>10.1145/2348283.2348384 | ||
In this paper, we propose a novel top-k learning to rank framework, which involves labeling strategy, ranking model and evaluation measure. The motivation comes from the difficulty in obtaining reliable relevance judgments from human assessors when applying ...
| Robust ranking models via risk-sensitive optimization | ||
| Lidan Wang, Paul N. Bennett, Kevyn Collins-Thompson | ||
| Pages: 761-770 | ||
| doi>10.1145/2348283.2348385 | ||
Many techniques for improving search result quality have been proposed. Typically, these techniques increase average effectiveness by devising advanced ranking features and/or by developing sophisticated learning to rank algorithms. However, while these ...
| SESSION: Community QA | ||
| Dual role model for question recommendation in community question answering | ||
| Fei Xu, Zongcheng Ji, Bin Wang | ||
| Pages: 771-780 | ||
| doi>10.1145/2348283.2348387 | ||
Question recommendation that automatically recommends a new question to suitable users to answer is an appealing and challenging problem in the research area of Community Question Answering (CQA). Unlike in general recommender systems where a user has ...
| Vote calibration in community question-answering systems | ||
| Bee-Chung Chen, Anirban Dasgupta, Xuanhui Wang, Jie Yang | ||
| Pages: 781-790 | ||
| doi>10.1145/2348283.2348388 | ||
User votes are important signals in community question-answering (CQA) systems. Many features of typical CQA systems, e.g. the best answer to a question, status of a user, are dependent on ratings or votes cast by the community. In a popular CQA site, ...
| Category hierarchy maintenance: a data-driven approach | ||
| Quan Yuan, Gao Cong, Aixin Sun, Chin-Yew Lin, Nadia Magnenat Thalmann | ||
| Pages: 791-800 | ||
| doi>10.1145/2348283.2348389 | ||
Category hierarchies often evolve at a much slower pace than the documents that reside in them. As newly available documents keep being added to a hierarchy, new topics emerge and documents within the same category become less topically cohesive. In this paper, ...
| When web search fails, searchers become askers: understanding the transition | ||
| Qiaoling Liu, Eugene Agichtein, Gideon Dror, Yoelle Maarek, Idan Szpektor | ||
| Pages: 801-810 | ||
| doi>10.1145/2348283.2348390 | ||
While Web search has become increasingly effective over the last decade, for many users' needs the required answers may be spread across many documents, or may not exist on the Web at all. Yet, many of these needs could be addressed by asking people ...
| SESSION: Federated search | ||
| Content-based retrieval for heterogeneous domains: domain adaptation by relative aggregation points | ||
| Makoto P. Kato, Hiroaki Ohshima, Katsumi Tanaka | ||
| Pages: 811-820 | ||
| doi>10.1145/2348283.2348392 | ||
We introduce the problem of domain adaptation for content-based retrieval and propose a domain adaptation method based on relative aggregation points (RAPs). Content-based retrieval including image retrieval and spoken document retrieval enables a user ...
| Mixture model with multiple centralized retrieval algorithms for result merging in federated search | ||
| Dzung Hong, Luo Si | ||
| Pages: 821-830 | ||
| doi>10.1145/2348283.2348393 | ||
Result merging is an important research problem in federated search for merging documents retrieved from multiple ranked lists of selected information sources into a single list. The state-of-the-art result merging algorithms such as Semi-Supervised ...
| Reactive index replication for distributed search engines | ||
| Flavio P. Junqueira, Vincent Leroy, Matthieu Morel | ||
| Pages: 831-840 | ||
| doi>10.1145/2348283.2348394 | ||
Distributed search engines comprise multiple sites deployed across geographically distant regions, each site being specialized to serve the queries of local users. When a search site cannot accurately compute the results of a query, it must forward the ...
| SESSION: Diversity 2 | ||
| Personalized diversification of search results | ||
| David Vallet, Pablo Castells | ||
| Pages: 841-850 | ||
| doi>10.1145/2348283.2348396 | ||
Search personalization and diversification are often seen as opposing alternatives to cope with query uncertainty, where, given an ambiguous query, it is either preferable to adapt the search result to a specific aspect that may interest the user (personalization) ...
| Combining implicit and explicit topic representations for result diversification | ||
| Jiyin He, Vera Hollink, Arjen de Vries | ||
| Pages: 851-860 | ||
| doi>10.1145/2348283.2348397 | ||
Result diversification deals with ambiguous or multi-faceted queries by providing documents that cover as many subtopics of a query as possible. Various approaches to subtopic modeling have been proposed. Subtopics have been extracted internally, e.g., ...
| Using preference judgments for novel document retrieval | ||
| Praveen Chandar, Ben Carterette | ||
| Pages: 861-870 | ||
| doi>10.1145/2348283.2348398 | ||
There has been considerable interest in incorporating diversity in search results to account for redundancy and the space of possible user needs. Most work on this problem is based on subtopics: diversity rankers score documents against a set ...
| SESSION: Evaluation 2 | ||
| Quality through flow and immersion: gamifying crowdsourced relevance assessments | ||
| Carsten Eickhoff, Christopher G. Harris, Arjen P. de Vries, Padmini Srinivasan | ||
| Pages: 871-880 | ||
| doi>10.1145/2348283.2348400 | ||
Crowdsourcing is a market of steadily-growing importance upon which both academia and industry increasingly rely. However, this market appears to be inherently infested with a significant share of malicious workers who try to maximise their profits through ...
| An IR-based evaluation framework for web search query segmentation | ||
| Rishiraj Saha Roy, Niloy Ganguly, Monojit Choudhury, Srivatsan Laxman | ||
| Pages: 881-890 | ||
| doi>10.1145/2348283.2348401 | ||
This paper presents the first evaluation framework for Web search query segmentation based directly on IR performance. In the past, segmentation strategies were mainly validated against manual annotations. Our work shows that the goodness of a segmentation ...
| On per-topic variance in IR evaluation | ||
| Stephen E. Robertson, Evangelos Kanoulas | ||
| Pages: 891-900 | ||
| doi>10.1145/2348283.2348402 | ||
We explore the notion, put forward by Cormack & Lynam and Robertson, that we should consider a document collection used for Cranfield-style experiments as a sample from some larger population of documents. In this view, any per-topic metric (such ...
| An uncertainty-aware query selection model for evaluation of IR systems | ||
| Mehdi Hosseini, Ingemar J. Cox, Natasa Milic-Frayling, Milad Shokouhi, Emine Yilmaz | ||
| Pages: 901-910 | ||
| doi>10.1145/2348283.2348403 | ||
We propose a mathematical framework for query selection as a mechanism for reducing the cost of constructing information retrieval test collections. In particular, our mathematical formulation explicitly models the uncertainty in the retrieval effectiveness ...
| SESSION: Representation | ||
| Improving retrieval of short texts through document expansion | ||
| Miles Efron, Peter Organisciak, Katrina Fenlon | ||
| Pages: 911-920 | ||
| doi>10.1145/2348283.2348405 | ||
Collections containing a large number of short documents are becoming increasingly common. As these collections grow in number and size, providing effective retrieval of brief texts presents a significant research problem. We propose a novel approach ...
|
||
| Extending BM25 with multiple query operators | ||
| Roi Blanco, Paolo Boldi | ||
| Pages: 921-930 | ||
| doi>10.1145/2348283.2348406 | ||
|
Full text: |
||
|
Traditional probabilistic relevance frameworks for information retrieval refrain from taking positional information into account, due to the hurdles of developing a sound model while avoiding an explosion in the number of parameters. Nonetheless, the ...
|
||
| Rhetorical relations for information retrieval | ||
| Christina Lioma, Birger Larsen, Wei Lu | ||
| Pages: 931-940 | ||
| doi>10.1145/2348283.2348407 | ||
|
Full text: |
||
|
Typically, every part in most coherent text has some plausible reason for its presence, some function that it performs to the overall semantics of the text. Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts of a text are ...
|
||
| Modeling higher-order term dependencies in information retrieval using query hypergraphs | ||
| Michael Bendersky, W. Bruce Croft | ||
| Pages: 941-950 | ||
| doi>10.1145/2348283.2348408 | ||
|
Full text: |
||
|
Many of the recent, and more effective, retrieval models have incorporated dependencies between the terms in the query. In this paper, we advance this query representation one step further, and propose a retrieval framework that models higher-order term ...
|
||
| SESSION: Classification | ||
| Confidence-aware graph regularization with heterogeneous pairwise features | ||
| Yuan Fang, Bo-June (Paul) Hsu, Kevin Chen-Chuan Chang | ||
| Pages: 951-960 | ||
| doi>10.1145/2348283.2348410 | ||
|
Full text: |
||
|
Conventional classification methods tend to focus on features of individual objects, while missing out on potentially valuable pairwise features that capture the relationships between objects. Although recent developments on graph regularization exploit ...
|
||
| A utility-theoretic ranking method for semi-automated text classification | ||
| Giacomo Berardi, Andrea Esuli, Fabrizio Sebastiani | ||
| Pages: 961-970 | ||
| doi>10.1145/2348283.2348411 | ||
|
Full text: |
||
|
In Semi-Automated Text Classification (SATC) an automatic classifier F labels a set of unlabelled documents D, following which a human annotator inspects (and corrects when appropriate) the labels attributed by F to a subset D' of D, with the aim of ...
|
||
| Improving tweet stream classification by detecting changes in word probability | ||
| Kyosuke Nishida, Takahide Hoshide, Ko Fujimura | ||
| Pages: 971-980 | ||
| doi>10.1145/2348283.2348412 | ||
|
Full text: |
||
|
We propose a classification model of tweet streams in Twitter, which are representative of document streams whose statistical properties will change over time. Our model solves several problems that hinder the classification of tweets; in particular, ...
|
||
| Predicting quality flaws in user-generated content: the case of wikipedia | ||
| Maik Anderka, Benno Stein, Nedim Lipka | ||
| Pages: 981-990 | ||
| doi>10.1145/2348283.2348413 | ||
|
Full text: |
||
|
The detection and improvement of low-quality information is a key concern in Web applications that are based on user-generated content; a popular example is the online encyclopedia Wikipedia. Existing research on quality assessment of user-generated ...
|
||
| SESSION: Doctoral submissions | ||
| A knowledge-based approach for summarising opinions | ||
| Marco Bonzanini | ||
| Pages: 991-991 | ||
| doi>10.1145/2348283.2348415 | ||
|
Full text: |
||
|
Automatic Document Summarisation plays a central role in the process of providing the user with a quick access to information. Applications range from the generation of news headlines, to the aggregation of opinions extracted from reviews. Traditional ...
|
||
| Adaptive IR for exploratory search support | ||
| Daniel T.J. Backhausen | ||
| Pages: 992-992 | ||
| doi>10.1145/2348283.2348416 | ||
|
Full text: |
||
|
Most Information Retrieval (IR) software is designed to fit a general user where users are submitting queries and the retrieval system returns a ranked list of results. Regardless of the user, the query always returns the same list of results. Individual ...
|
||
| Adversarial content manipulation effects | ||
| Fiana Raiber | ||
| Pages: 993-993 | ||
| doi>10.1145/2348283.2348417 | ||
|
Full text: |
||
|
We address a question that has been somewhat overlooked throughout the transition from classical ad hoc retrieval to Web search: how is the performance of classical retrieval approaches affected by the presence of content manipulation? Our initial experiments ...
|
||
| Building reputation and trust using federated search and opinion mining | ||
| Somayeh Khatiban | ||
| Pages: 994-994 | ||
| doi>10.1145/2348283.2348418 | ||
|
Full text: |
||
|
The term online reputation addresses trust relationships amongst agents in dynamic open systems. These can appear as ratings, recommendations, referrals and feedback. Several reputation models and rating aggregation algorithms have been proposed. However, ...
|
||
| Enhancing knowledge base with knowledge transfer | ||
| Si-Chi Chin | ||
| Pages: 995-995 | ||
| doi>10.1145/2348283.2348419 | ||
|
Full text: |
||
|
A Knowledge Base (KB) stores, organizes, and shares information pertinent to entities (i.e. KB nodes) such as people, organizations, and events. A large KB system, such as Wikipedia, relies on human curators to create and maintain the content in the ...
|
||
| Improving e-discovery using information retrieval | ||
| Kripabandhu Ghosh | ||
| Pages: 996-996 | ||
| doi>10.1145/2348283.2348420 | ||
|
Full text: |
||
|
E-discovery is the requirement that the documents and information in electronic form stored in corporate systems be produced as evidence in litigation. It has posed great challenges for legal experts. Legal searchers have always looked to find "any and ...
|
||
| Opinion influence and diffusion in social network | ||
| Dehong Gao | ||
| Pages: 997-997 | ||
| doi>10.1145/2348283.2348421 | ||
|
Full text: |
||
|
Nowadays, more and more people tend to make decisions based on the opinion information from the Internet, in addition to recommendations from offline friends or parents. For example, we may browse the resumes and comments on election candidates to determine ...
|
||
| Relevance as a subjective and situational multidimensional concept | ||
| Carsten Eickhoff | ||
| Pages: 998-998 | ||
| doi>10.1145/2348283.2348422 | ||
|
Full text: |
||
|
Relevance is the central concept of information retrieval. Although its important role is unanimously accepted among researchers, numerous different definitions of the term have emerged over the years. Considerable effort has been put into creating consistent ...
|
||
| Exploiting temporal topic models in social media retrieval | ||
| Tuan A. Tran | ||
| Pages: 999-999 | ||
| doi>10.1145/2348283.2348423 | ||
|
Full text: |
||
|
Much of the user-generated content in Web 2.0 centers around real-world incidents such as the Japanese tsunami, or general concerns such as the recent economic downturn. This type of information is always of interest to users. For instance, when a user reads ...
|
||
| The essence of time: considering temporal relevance as an intent-aware ranking problem | ||
| Stewart Whiting | ||
| Pages: 1000-1000 | ||
| doi>10.1145/2348283.2348424 | ||
|
Full text: |
||
|
Real-time news and social media quickly reflect large-scale phenomena and events. As users become exposed to this information, time plays a central role in prompting both information authorship and seeking activities. The objective of this research is ...
|
||
| DEMONSTRATION SESSION: Demonstrations | ||
| A framework for manipulating and searching multiple retrieval types | ||
| Marc-Allen Cartright, Ethem F. Can, William Dabney, Jeff Dalton, Logan Giorda, Kriste Krstovski, Xiaoye Wu, Ismet Zeki Yalniz, James Allan, R. Manmatha, David A. Smith | ||
| Pages: 1001-1001 | ||
| doi>10.1145/2348283.2348426 | ||
|
Full text: |
||
|
Conventional retrieval systems view documents as a unit and look at different retrieval types within a document. We introduce Proteus, a framework for seamlessly navigating books as dynamic collections which are defined on the fly. Proteus allows us ...
|
||
| A visual tool for bayesian data analysis: the impact of smoothing on naive bayes text classifiers | ||
| Giorgio Maria Di Nunzio, Alessandro Sordoni | ||
| Pages: 1002-1002 | ||
| doi>10.1145/2348283.2348427 | ||
|
Full text: |
||
|
Naive-Bayes (NB) classifiers are simple probabilistic classifiers still widely used in supervised learning due to their tradeoff between efficient model training and good empirical results. One of the drawbacks of these classifiers is that in situations ...
|
||
| ALF: a client side logger and server for capturing user interactions in web applications | ||
| Leif Azzopardi, Myles Doolan, Richard Glassey | ||
| Pages: 1003-1003 | ||
| doi>10.1145/2348283.2348428 | ||
|
Full text: |
||
|
This demonstration paper introduces ALF which provides a light-weight client side logging application and a server for collecting user interaction data. ALF has been designed as a loosely coupled independent service that runs in parallel with the IR ...
|
||
| ChatNoir: a search engine for the ClueWeb09 corpus | ||
| Martin Potthast, Matthias Hagen, Benno Stein, Jan Graßegger, Maximilian Michel, Martin Tippmann, Clement Welsch | ||
| Pages: 1004-1004 | ||
| doi>10.1145/2348283.2348429 | ||
|
Full text: |
||
|
We present the ChatNoir search engine which indexes the entire English part of the ClueWeb09 corpus. Besides Carnegie Mellon's Indri system, ChatNoir is the second publicly available search engine for this corpus. It implements the classic BM25F information ...
|
||
| CrowdTerrier: automatic crowdsourced relevance assessments with terrier | ||
| Richard McCreadie, Craig Macdonald, Iadh Ounis | ||
| Pages: 1005-1005 | ||
| doi>10.1145/2348283.2348430 | ||
|
Full text: |
||
|
In this demo, we present CrowdTerrier, an infrastructure extension to the open source Terrier IR platform that enables the semi-automatic generation of relevance assessments for a variety of document ranking tasks using crowdsourcing. The aim of CrowdTerrier ...
|
||
| Distilling and exploring nuggets from a corpus | ||
| Vittorio Castelli, Hema Raghavan, Radu Florian, Ding-Jung Han, Xiaoqiang Luo, Salim Roukos | ||
| Pages: 1006-1006 | ||
| doi>10.1145/2348283.2348431 | ||
|
Full text: |
||
|
This paper describes a live and scalable system that automatically extracts information nuggets for entities/topics from a continuously updated corpus for effective exploration and analysis. A nugget is a piece of semantic information that (1) must be ...
|
||
| Integrative online research-data management | ||
| Michael Huggett, Edie Rasmussen | ||
| Pages: 1007-1007 | ||
| doi>10.1145/2348283.2348432 | ||
|
Full text: |
||
|
In support of our research projects in information retrieval, we have developed an integrated multi-process software system that shepherds research data from induction through aggregation, analysis, and presentation. We combine public-domain code libraries ...
|
||
| MaSe: create your own mash-up search interface | ||
| Leif Azzopardi, Douglas Dowie, Kelly Ann Marshall, Richard Glassey | ||
| Pages: 1008-1008 | ||
| doi>10.1145/2348283.2348433 | ||
|
Full text: |
||
|
MaSe provides a sandbox environment for high school students to create their own personalised search interface. It has been designed with two major goals in mind: (1) as a hands-on tutorial for school children, to excite them about programming and computing ...
|
||
| myDJ: recommending karaoke songs from one's own voice | ||
| Kuang Mao, Xinyuan Luo, Ke Chen, Gang Chen, Lidan Shou | ||
| Pages: 1009-1009 | ||
| doi>10.1145/2348283.2348434 | ||
|
Full text: |
||
|
In this demo, we present myDJ, a karaoke recommendation system which recommends songs that people are capable of singing. Unlike existing song recommendation systems, which recommend songs people like to listen to, myDJ can recommend proper songs ...
|
||
| PageFetch: a retrieval game for children (and adults) | ||
| Leif Azzopardi, Jim Purvis, Richard Glassey | ||
| Pages: 1010-1010 | ||
| doi>10.1145/2348283.2348435 | ||
|
Full text: |
||
|
Children often struggle with information retrieval tasks as searching for information often requires a developed vocabulary and strong categorisation skills; neither of which are particularly developed in children under the age of 12. In a study conducted ...
|
||
| Pictune: situational music recommendation from geotagged pictures | ||
| Ke Chen, Gang Chen, Lidan Shou, Fei Xia | ||
| Pages: 1011-1011 | ||
| doi>10.1145/2348283.2348436 | ||
|
Full text: |
||
| Political search trends | ||
| Ingmar Weber, Venkata Rama Kiran Garimella, Erik Borra | ||
| Pages: 1012-1012 | ||
| doi>10.1145/2348283.2348437 | ||
|
Full text: |
||
|
We present Political Search Trends, a browser based web search analysis tool that (i) assigns a political leaning to web search queries, (ii) detects trending political queries in a given week, and (iii) links search queries to fact-checked statements. ...
|
||
| RDF Xpress: a flexible expressive RDF search engine | ||
| Shady Elbassuoni, Maya Ramanath, Gerhard Weikum | ||
| Pages: 1013-1013 | ||
| doi>10.1145/2348283.2348438 | ||
|
Full text: |
||
|
We demonstrate RDF Xpress, a search engine that enables users to effectively retrieve information from large RDF knowledge bases or Linked Data Sources. RDF Xpress provides a search interface where users can combine triple patterns with keywords to form ...
|
||
| Sketch-based image similarity search with a pen and paper interface | ||
| Ihab Al Kabary, Heiko Schuldt | ||
| Pages: 1014-1014 | ||
| doi>10.1145/2348283.2348439 | ||
|
Full text: |
||
|
We present a novel and innovative user interface for query-by-sketching based image retrieval that exploits emergent interactive paper and digital pen technology. Users can draw sketches with a digital pen on interactive paper in a user-friendly way. ...
|
||
| Task-aware search assistant | ||
| Henry Allen Feild, James Allan | ||
| Pages: 1015-1015 | ||
| doi>10.1145/2348283.2348440 | ||
|
Full text: |
||
| TweetSpector: entity-based retrieval of tweets | ||
| Surender Reddy Yerva, Zoltan Miklos, Flavia Grosan, Alexandru Tandrau, Karl Aberer | ||
| Pages: 1016-1016 | ||
| doi>10.1145/2348283.2348441 | ||
|
Full text: |
||
|
TweetSpector is a tool for demonstrating entity-based retrieval of tweets. The various features of this tool include: entity profile creation, real-time tweet classification, active improvement of the created profiles through user feedback, and the ...
|
||
| YooSee: a video browsing application for young children | ||
| Leif Azzopardi, Douglas Dowie, Kelly Ann Marshall | ||
| Pages: 1017-1017 | ||
| doi>10.1145/2348283.2348442 | ||
|
Full text: |
||
|
Nowadays children as young as two years old can easily interact with mobile touch screen devices and personal computers to watch online videos through services such as YouTube. However, such services present a number of challenges for young children ...
|
||
| Multi-platform image search using tag enrichment | ||
| Jinming Min, Cristover Lopes, Johannes Leveling, Dag Schmidtke, Gareth J.F. Jones | ||
| Pages: 1018-1018 | ||
| doi>10.1145/2348283.2348443 | ||
|
Full text: |
||
|
The number of images available online is growing steadily and current web search engines have indexed more than 10 billion images. Approaches to image retrieval are still often text-based and operate on image annotations and captions. Image annotations ...
|
||
| SESSION: Industry talk abstracts | ||
| IR paradigms in computational advertising | ||
| Andrei Z. Broder | ||
| Pages: 1019-1019 | ||
| doi>10.1145/2348283.2348445 | ||
|
Full text: |
||
|
The central problem in the emerging discipline of computational advertising is to find the "best match" between a given user in a given context and a suitable advertisement. The context could be a user entering a query in a search engine ("sponsored ...
|
||
| Watson: the Jeopardy! challenge and beyond | ||
| Eric W. Brown | ||
| Pages: 1020-1020 | ||
| doi>10.1145/2348283.2348446 | ||
|
Full text: |
||
|
Watson, named after IBM founder Thomas J. Watson, was built by a team of IBM researchers who set out to accomplish a grand challenge: build a computing system that rivals a human's ability to answer questions posed in natural language with speed, accuracy ...
|
||
| Putting context into search and search into context | ||
| Susan T. Dumais | ||
| Pages: 1021-1021 | ||
| doi>10.1145/2348283.2348447 | ||
|
Full text: |
||
|
It is a very challenging task to understand a short query, especially if that query is considered in isolation. Luckily, queries do not magically appear in a search box -- rather, they are issued by real people, trying to accomplish a task, at a given point ...
|
||
| CloudSearch and the democratization of information retrieval | ||
| Daniel E. Rose | ||
| Pages: 1022-1023 | ||
| doi>10.1145/2348283.2348448 | ||
|
Full text: |
||
|
Amazon CloudSearch is a new hosted search service, built on top of many cloud-based AWS services, and based on the same technology that powers search on Amazon's retail sites. Because of its ease of configuration and scalability, CloudSearch represents ...
|
||
| Entity sentiment extraction using text ranking | ||
| John O'Neil | ||
| Pages: 1024-1024 | ||
| doi>10.1145/2348283.2348449 | ||
|
Full text: |
||
|
Entity extraction and sentiment classification are among the most common types of information derived from documents, but the problem of directly associating entities and sentiment has received less attention. We use TextRank on a graph linking entities ...
|
||
| POSTER SESSION: Poster abstracts | ||
| A hybrid model for ad-hoc information retrieval | ||
| Zheng Ye, Jimmy Xiangji Huang, Jun Miao | ||
| Pages: 1025-1026 | ||
| doi>10.1145/2348283.2348451 | ||
|
Full text: |
||
|
Many information retrieval (IR) techniques have been proposed to improve performance, and some combinations of these techniques have been demonstrated to be effective. However, how to effectively combine them is largely unexplored. It is possible ...
|
||
| Exploiting paths for entity search in RDF graphs | ||
| Minsuk Kahng, Sang-goo Lee | ||
| Pages: 1027-1028 | ||
| doi>10.1145/2348283.2348452 | ||
|
Full text: |
||
|
The field of entity search using Semantic Web (RDF) data has gained more interest recently. In this paper, we propose a probabilistic entity retrieval model for RDF graphs using paths in the graph. Unlike previous work which assumes that all descriptions ...
|
||
| A study of term weighting schemes using class information for text classification | ||
| Youngjoong Ko | ||
| Pages: 1029-1030 | ||
| doi>10.1145/2348283.2348453 | ||
|
Full text: |
||
| A topic model of clinical reports | ||
| Corey Arnold, William Speier | ||
| Pages: 1031-1032 | ||
| doi>10.1145/2348283.2348454 | ||
|
Full text: |
||
|
Clinical narrative in the medical record provides perhaps the most detailed account of a patient's history. However, this information is documented in free-text, which makes it challenging to analyze. Efforts to index unstructured clinical narrative ...
|
||
| Active query selection for learning rankers | ||
| Mustafa Bilgic, Paul N. Bennett | ||
| Pages: 1033-1034 | ||
| doi>10.1145/2348283.2348455 | ||
|
Full text: |
||
|
Methods that reduce the amount of labeled data needed for training have focused more on selecting which documents to label than on which queries should be labeled. One exception to this (Long et al. 2010) uses expected loss optimization (ELO) to estimate ...
|
||
| Anticipatory search: using context to initiate search | ||
| Daniel J. Liebling, Paul N. Bennett, Ryen W. White | ||
| Pages: 1035-1036 | ||
| doi>10.1145/2348283.2348456 | ||
|
Full text: |
||
|
Identifying content for which a user may search has a variety of applications, including ranking and recommendation. In this poster, we examine how pre-search context can be used to predict content that the user will seek before they have even specified ...
|
||
| BReK12: a book recommender for K-12 users | ||
| Maria Soledad Pera, Yiu-Kai Ng | ||
| Pages: 1037-1038 | ||
| doi>10.1145/2348283.2348457 | ||
|
Full text: |
||
|
Ideally, students in K-12 grade levels can turn to book recommenders to locate books that match their interests. Existing book recommenders, however, fail to take into account the readability levels of their users, and hence their recommendations may ...
|
||
| Clarity re-visited | ||
| Shay Hummel, Anna Shtok, Fiana Raiber, Oren Kurland, David Carmel | ||
| Pages: 1039-1040 | ||
| doi>10.1145/2348283.2348458 | ||
|
Full text: |
||
|
We present a novel interpretation of Clarity [5], a widely used query performance predictor. While Clarity is commonly described as a measure of the "distance" between the language model of the top-retrieved documents and that of the collection, we show ...
|
||
| Cluster-based one-class ensemble for classification problems in information retrieval | ||
| Nedim Lipka, Benno Stein, Maik Anderka | ||
| Pages: 1041-1042 | ||
| doi>10.1145/2348283.2348459 | ||
|
Full text: |
||
|
A number of relevant information retrieval classification problems are one-class classification problems at heart. I.e., labeled data is only available for one class, the so-called target class, and common discrimination-based classification approaches, ...
|
||
| Collaborative filtering with short term preferences mining | ||
| Diyi Yang, Tianqi Chen, Weinan Zhang, Yong Yu | ||
| Pages: 1043-1044 | ||
| doi>10.1145/2348283.2348460 | ||
|
Full text: |
||
|
Recently, recommender systems have fascinated researchers and benefited a variety of people's online activities, enabling users to survive the explosive web information. Traditional collaborative filtering techniques handle the general recommendation ...
|
||
| Creating temporally dynamic web search snippets | ||
| Krysta M. Svore, Jaime Teevan, Susan T. Dumais, Anagha Kulkarni | ||
| Pages: 1045-1046 | ||
| doi>10.1145/2348283.2348461 | ||
|
Full text: |
||
|
Content on the Internet is always changing. We explore the value of biasing search result snippets towards new webpage content. We present results from a user study comparing traditional query-focused snippets with snippets that emphasize new page content ...
|
||
| Dependency trigram model for social relation extraction from news articles | ||
| Maengsik Choi, Harksoo Kim, Bruce W. Croft | ||
| Pages: 1047-1048 | ||
| doi>10.1145/2348283.2348462 | ||
|
Full text: |
||
|
We propose a kernel-based model to automatically extract social relations such as economic relations and political relations between two people from news articles. To determine whether two people are structurally associated with each other, the proposed ...
|
||
| Detecting candidate named entities in search queries | ||
| Areej Alasiry, Mark Levene, Alexandra Poulovassilis | ||
| Pages: 1049-1050 | ||
| doi>10.1145/2348283.2348463 | ||
|
Full text: |
||
|
The information extraction task of Named Entities Recognition (NER) has been recently applied to search engine queries, in order to better understand their semantics. Here we concentrate on the task prior to the classification of the named entities ...
|
||
| Effect of dynamic pruning safety on learning to rank effectiveness | ||
| Craig Macdonald, Nicola Tonellotto, Iadh Ounis | ||
| Pages: 1051-1052 | ||
| doi>10.1145/2348283.2348464 | ||
|
Full text: |
||
|
A dynamic pruning strategy, such as WAND, enhances retrieval efficiency without degrading effectiveness to a given rank K, known as safe-to-rank-K. However, it is also possible for WAND to obtain more efficient but unsafe retrieval without actually significantly ...
|
||
| Effect of written instructions on assessor agreement | ||
| William Webber, Bryan Toth, Marjorie Desamito | ||
| Pages: 1053-1054 | ||
| doi>10.1145/2348283.2348465 | ||
|
Full text: |
||
|
Assessors frequently disagree on the topical relevance of documents. How much of this disagreement is due to ambiguity in assessment instructions? We have two assessors assess TREC Legal Track documents for relevance, some to a general topic description, ...
|
||
| Effects of expertise differences in synchronous social Q&A | ||
| Ryen W. White, Matthew Richardson | ||
| Pages: 1055-1056 | ||
| doi>10.1145/2348283.2348466 | ||
|
Full text: |
||
|
Synchronous social question-and-answer (Q&A) systems match askers to answerers and support real-time dialog between them to resolve questions. These systems typically find answerers based on the degree of expertise match with the asker's initial ...
|
||
| Efficient estimation of aspect weights | ||
| Jon Parker, Andrew Yates, Nazli Goharian, Wai Gen Yee | ||
| Pages: 1057-1058 | ||
| doi>10.1145/2348283.2348467 | ||
|
Full text: |
||
|
Many websites encourage people to submit reviews of various products and services. We present and evaluate a novel approach to efficiently model and analyze the text within user reviews to estimate how much reviewers care about different aspects of a ...
|
||
| Emotion tagging for comments of online news by meta classification with heterogeneous information sources | ||
| Ying Zhang, Yi Fang, Xiaojun Quan, Lin Dai, Luo Si, Xiaojie Yuan | ||
| Pages: 1059-1060 | ||
| doi>10.1145/2348283.2348468 | ||
|
Full text: |
||
|
With the rapid growth of online news services, users can actively respond to online news by making comments. Users often express subjective emotions in comments such as sadness, surprise and anger. Such emotions can help understand the preferences and ...
|
||
| Estimating the magic barrier of recommender systems: a user study | ||
| Alan Said, Brijnesh J. Jain, Sascha Narr, Till Plumbaum, Sahin Albayrak, Christian Scheel | ||
| Pages: 1061-1062 | ||
| doi>10.1145/2348283.2348469 | ||
|
Full text: |
||
|
Recommender systems are commonly evaluated by trying to predict known, withheld, ratings for a set of users. Measures such as the Root-Mean-Square Error are used to estimate the quality of the recommender algorithms. This process does however not acknowledge ...
|
||
| Explaining neighborhood-based recommendations | ||
| Sergio Cleger-Tamayo, Juan M. Fernandez-Luna, Juan F. Huete | ||
| Pages: 1063-1064 | ||
| doi>10.1145/2348283.2348470 | ||
|
Full text: |
||
|
Recommender Systems (RS) attempt to discover users' preferences, and to learn about them in order to anticipate their needs. The main task normally associated with a RS is to offer suggestions for items. However, for most users, RSs are black boxes, ...
|
||
| Exploiting term dependence while handling negation in medical search | ||
| Nut Limsopatham, Craig Macdonald, Richard McCreadie, Iadh Ounis | ||
| Pages: 1065-1066 | ||
| doi>10.1145/2348283.2348471 | ||
|
Full text: |
||
|
In medical records, negative qualifiers, e.g. no or without, are commonly used by health practitioners to identify the absence of a medical condition. Without considering whether the term occurs in a negative or positive context, the sole presence of ...
|
||
| Exploring example-based person search in email | ||
| Tan Xu, Douglas W. Oard | ||
| Pages: 1067-1068 | ||
| doi>10.1145/2348283.2348472 | ||
|
Full text: |
||
|
This paper describes an entity ranking model for example-based person search in email. Evaluation by comparison to manually resolved named references in Enron email yields results that correspond to typically placing the correct entity in the first or ...
|
||
| Exploring tag relevance for image tag re-ranking | ||
| Jie Xiao, Wengang Zhou, Qi Tian | ||
| Pages: 1069-1070 | ||
| doi>10.1145/2348283.2348473 | ||
|
Full text: |
||
|
In this paper, we propose to explore the relevance between tags for image tag re-ranking. The key component is to define a global tag-tag similarity matrix, which is achieved by analysis in both semantic and visual aspects. The text semantic relevance ...
|
||
| Fast on-line learning for multilingual categorization | ||
| Michelle Kovesi, Cyril Goutte, Massih-Reza Amini | ||
| Pages: 1071-1072 | ||
| doi>10.1145/2348283.2348474 | ||
|
Full text: |
||
|
Multiview learning has been shown to be a natural and efficient framework for supervised or semi-supervised learning of multilingual document categorizers. The state-of-the-art co-regularization approach relies on alternate minimizations of a combination ...
|
||
| Finding interesting posts in Twitter based on retweet graph analysis | ||
| Min-Chul Yang, Jung-Tae Lee, Seung-Wook Lee, Hae-Chang Rim | ||
| Pages: 1073-1074 | ||
| doi>10.1145/2348283.2348475 | ||
|
Full text: |
||
|
Millions of posts are being generated in real-time by users in social networking services, such as Twitter. However, a considerable number of those posts are mundane posts that are of interest to the authors and possibly their friends only. This paper ...
|
||
| Finding readings for scientists from social websites | ||
| Jiepu Jiang, Zhen Yue, Shuguang Han, Daqing He | ||
| Pages: 1075-1076 | ||
| doi>10.1145/2348283.2348476 | ||
|
Full text: |
||
|
Current search systems are designed to find relevant articles, especially topically relevant ones, but the notion of relevance largely depends on search tasks. We study the specific task that scientists are searching for worth-reading articles beneficial ...
|
||
| Finding web appearances of social network users via latent factor model | ||
| Kailong Chen, Zhengdong Lu, Xiaoshi Yin, Yong Yu, Zaiqing Nie | ||
| Pages: 1077-1078 | ||
| doi>10.1145/2348283.2348477 | ||
|
Full text: |
||
|
With the rapid growth of Web 2.0, people spend more time on social networks such as Facebook and Twitter. Finding the web appearances of the people they interact with would help social network users a great deal in getting to know them. ...
|
||
| Fixed versus dynamic co-occurrence windows in TextRank term weights for information retrieval | ||
| Wei Lu, Qikai Cheng, Christina Lioma | ||
| Pages: 1079-1080 | ||
| doi>10.1145/2348283.2348478 | ||
|
Full text: |
||
|
TextRank is a variant of PageRank typically used in graphs that represent documents, and where vertices denote terms and edges denote relations between terms. Quite often the relation between terms is simple term co-occurrence within a fixed window of ...
|
||
| Gender-aware re-ranking | ||
| Eugene Kharitonov, Pavel Serdyukov | ||
| Pages: 1081-1082 | ||
| doi>10.1145/2348283.2348479 | ||
|
Full text: |
||
|
In this paper we study the usefulness of users' gender information for improving the ranking of ambiguous queries in personalized and non-contextual settings. This study is performed as a sequence of offline re-ranking experiments and it demonstrates that the ...
|
||
| Genre classification for million song dataset using confidence-based classifiers combination | ||
| Yajie Hu, Mitsunori Ogihara | ||
| Pages: 1083-1084 | ||
| doi>10.1145/2348283.2348480 | ||
|
Full text: |
||
|
We propose a method to classify songs in the Million Song Dataset according to song genre. Since songs have several data types, we train sub-classifiers on different types of data. These sub-classifiers are combined using both classifier authority ...
|
||
| GLASE 0.1: eyes tell more than mice | ||
| Viktors Garkavijs, Mayumi Toshima, Noriko Kando | ||
| Pages: 1085-1086 | ||
| doi>10.1145/2348283.2348481 | ||
|
Full text: |
||
|
This paper proposes a prototype system called Gaze-Learning-Access-and-Search-Engine 0.1 (GLASE), which can perform image relevance ranking based on gaze data and within-session learning. We developed a search user interface that uses an eye-tracker ...
| How query extensions reflect search result abandonments | ||
| Aleksandr Chuklin, Pavel Serdyukov | ||
| Pages: 1087-1088 | ||
| doi>10.1145/2348283.2348482 | ||
It is often considered that high abandonment rate corresponds to poor IR system performance. However several studies suggested that there are so called good abandonments, i.e. situations when search engine result page contains enough details to ...
| Identifying entity aspects in microblog posts | ||
| Damiano Spina, Edgar Meij, Maarten de Rijke, Andrei Oghina, Minh Thuong Bui, Mathias Breuss | ||
| Pages: 1089-1090 | ||
| doi>10.1145/2348283.2348483 | ||
Online reputation management is about monitoring and handling the public image of entities (such as companies) on the Web. An important task in this area is identifying "aspects" of the entity of interest (such as products, services, competitors, key ...
| Impact of assessor disagreement on ranking performance | ||
| Pavel Metrikov, Virgil Pavlu, Javed A. Aslam | ||
| Pages: 1091-1092 | ||
| doi>10.1145/2348283.2348484 | ||
We consider the impact of inter-assessor disagreement on the maximum performance that a ranker can hope to achieve. We demonstrate that even if a ranker were to achieve perfect performance with respect to a given assessor, when evaluated with respect ...
| Incorporating statistical topic information in relevance feedback | ||
| Karla L. Caballero, Ram Akella | ||
| Pages: 1093-1094 | ||
| doi>10.1145/2348283.2348485 | ||
Most of the relevance feedback algorithms only use document terms as feedback (local features) in order to update the query and re-rank the documents to show to the user. This approach is limited by the terms of those documents without any global context. ...
| Inferring missing relevance judgments from crowd workers via probabilistic matrix factorization | ||
| Hyun Joon Jung, Matthew Lease | ||
| Pages: 1095-1096 | ||
| doi>10.1145/2348283.2348486 | ||
In crowdsourced relevance judging, each crowd worker typically judges only a small number of examples, yielding a sparse and imbalanced set of judgments in which relatively few workers influence output consensus labels, particularly with simple consensus ...
| Investigating performance predictors using monte carlo simulation and score distribution models | ||
| Ronan Cummins | ||
| Pages: 1097-1098 | ||
| doi>10.1145/2348283.2348487 | ||
The standard deviation of scores in the top k documents of a ranked list has been shown to be significantly correlated with average precision and has been the basis of a number of query performance predictors. In this paper, we outline two hypotheses ...
| Learning to select a time-aware retrieval model | ||
| Nattiya Kanhabua, Klaus Berberich, Kjetil Nørvåg | ||
| Pages: 1099-1100 | ||
| doi>10.1145/2348283.2348488 | ||
Time-aware retrieval models exploit one of two time dimensions, namely, (a) publication time or (b) content time (temporal expressions mentioned in documents). We show that the effectiveness for a temporal query (e.g., illinois earthquake ...
| Learning-based time-sensitive re-ranking for web search | ||
| Po-Tzu Chang, Yen-Chieh Huang, Cheng-Lun Yang, Shou-De Lin, Pu-Jen Cheng | ||
| Pages: 1101-1102 | ||
| doi>10.1145/2348283.2348489 | ||
To model time-dependent user intent for Web search, this paper proposes a novel method using machine learning techniques to exploit temporal features for effective time-sensitive search result re-ranking. We propose models to incorporate users' click ...
| Lightweight contrastive summarization for news comment mining | ||
| Gobaan Raveendran, Charles L.A. Clarke | ||
| Pages: 1103-1104 | ||
| doi>10.1145/2348283.2348490 | ||
We develop and discuss a news comment miner that presents distinct viewpoints on a given theme or event. Given a query, the system uses metasearch techniques to find relevant news articles. Relevant articles are then scraped for both article content ...
| Looking inside the box: context-sensitive translation for cross-language information retrieval | ||
| Ferhan Ture, Jimmy Lin, Douglas W. Oard | ||
| Pages: 1105-1106 | ||
| doi>10.1145/2348283.2348491 | ||
Cross-language information retrieval (CLIR) today is dominated by techniques that use token-to-token mappings from bilingual dictionaries. Yet, state-of-the-art statistical translation models (e.g., using Synchronous Context-Free Grammars) are far richer, ...
| Making results fit into 40 characters: a study in document rewriting | ||
| Johannes Leveling, Gareth J.F. Jones | ||
| Pages: 1107-1108 | ||
| doi>10.1145/2348283.2348492 | ||
With the increasing popularity of mobile and hand-held devices, automatic approaches for adapting results to the limited screen size of mobile devices are becoming more important. Traditional approaches for reducing the length of textual results include ...
| New assessment criteria for query suggestion | ||
| Zhongrui Ma, Yu Chen, Ruihua Song, Tetsuya Sakai, Jiaheng Lu, Ji-Rong Wen | ||
| Pages: 1109-1110 | ||
| doi>10.1145/2348283.2348493 | ||
Query suggestion is a useful tool to help users express their information needs by supplying alternative queries. When evaluating the effectiveness of query suggestion algorithms, many previous studies focus on measuring whether a suggestion query is ...
| On automatically tagging web documents from examples | ||
| Nicholas Joel Woodward, Weijia Xu, Kent Norsworthy | ||
| Pages: 1111-1112 | ||
| doi>10.1145/2348283.2348494 | ||
An emerging need in information retrieval is to identify a set of documents conforming to an abstract description. This task presents two major challenges to existing methods of document retrieval and classification. First, similarity based on overall ...
| On building a reusable Twitter corpus | ||
| Richard McCreadie, Ian Soboroff, Jimmy Lin, Craig Macdonald, Iadh Ounis, Dean McCullough | ||
| Pages: 1113-1114 | ||
| doi>10.1145/2348283.2348495 | ||
The Twitter real-time information network is the subject of research for information retrieval tasks such as real-time search. However, so far, reproducible experimentation on Twitter data has been impeded by restrictions imposed by the Twitter terms ...
| On judgments obtained from a commercial search engine | ||
| Emine Yilmaz, Gabriella Kazai, Nick Craswell, Saied Mehrizi Tahaghoghi | ||
| Pages: 1115-1116 | ||
| doi>10.1145/2348283.2348496 | ||
In information retrieval, relevance judgments play an important role as they are required both for evaluating the quality of retrieval systems and for training learning to rank algorithms. In recent years, numerous papers have been published using judgments ...
| On the mathematical relationship between expected n-call@k and the relevance vs. diversity trade-off | ||
| Kar Wai Lim, Scott Sanner, Shengbo Guo | ||
| Pages: 1117-1118 | ||
| doi>10.1145/2348283.2348497 | ||
It has been previously noted that optimization of the n-call@k relevance objective (i.e., a set-based objective that is 1 if at least n documents in a set of k are relevant, otherwise 0) encourages more result set diversification ...
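The n-call@k objective defined in this abstract can be written in a few lines. The following is a minimal sketch, not code from the paper; the function name `n_call_at_k` and the representation of a result list as a sequence of boolean relevance flags are assumptions made for illustration.

```python
from typing import Sequence


def n_call_at_k(relevance: Sequence[bool], n: int, k: int) -> int:
    """Return 1 if at least n of the first k results are relevant, else 0.

    `relevance` is a hypothetical input format: one boolean per ranked
    result, in rank order, where True marks a relevant document.
    """
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    # Count relevant documents among the top k results only.
    relevant_in_top_k = sum(1 for is_relevant in relevance[:k] if is_relevant)
    return int(relevant_in_top_k >= n)


if __name__ == "__main__":
    ranked = [True, False, True, False]
    print(n_call_at_k(ranked, n=1, k=2))
    print(n_call_at_k(ranked, n=2, k=2))
```

With n = 1 the objective only asks for a single relevant result in the top k, while n = k requires every one of the top k results to be relevant.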
| On real-time ad-hoc retrieval evaluation | ||
| Stephen E. Robertson, Evangelos Kanoulas | ||
| Pages: 1119-1120 | ||
| doi>10.1145/2348283.2348498 | ||
Lab-based evaluations typically assess the quality of a retrieval system with respect to its ability to retrieve documents that are relevant to the information need of an end user. In a real-time search task however users not only wish to retrieve the ...
| Opinion summarisation through sentence extraction: an investigation with movie reviews | ||
| Marco Bonzanini, Miguel Martinez-Alvarez, Thomas Roelleke | ||
| Pages: 1121-1122 | ||
| doi>10.1145/2348283.2348499 | ||
In on-line reviews, authors often use a short passage to describe the overall feeling about a product or a service. A review as a whole can mention many details not in line with the overall feeling, so capturing this key passage is important to understand ...
| Optimizing parameters of the expected reciprocal rank | ||
| Yury Logachev, Pavel Serdyukov | ||
| Pages: 1123-1124 | ||
| doi>10.1145/2348283.2348500 | ||
Most popular IR metrics are parameterized. Usually parameters of these metrics are chosen on the basis of general considerations and not adjusted by experiments with real users. Particularly, the parameters of the Expected Reciprocal Rank measure are ...
| Ousting ivory tower research: towards a web framework for providing experiments as a service | ||
| Tim Gollub, Benno Stein, Steven Burrows | ||
| Pages: 1125-1126 | ||
| doi>10.1145/2348283.2348501 | ||
With its close ties to the Web, the IR community is destined to leverage the dissemination and collaboration capabilities that the Web provides today. Especially with the advent of the software as a service principle, an IR community is conceivable that ...
| Parallelizing ListNet training using spark | ||
| Shilpa Shukla, Matthew Lease, Ambuj Tewari | ||
| Pages: 1127-1128 | ||
| doi>10.1145/2348283.2348502 | ||
As ever-larger training sets for learning to rank are created, scalability of learning has become increasingly important to achieving continuing improvements in ranking accuracy. Exploiting independence of "summation form" computations, we show how each ...
| Predicting lifespans of popular tweets in microblog | ||
| Shoubin Kong, Ling Feng, Guozheng Sun, Kan Luo | ||
| Pages: 1129-1130 | ||
| doi>10.1145/2348283.2348503 | ||
In microblog like Twitter, popular tweets are usually retweeted by many users. For different tweets, their lifespans (i.e., how long they will stay popular) vary. This paper presents a simple yet effective approach to predict the lifespans of ...
| Preliminary study of technical terminology for the retrieval of scientific book metadata records | ||
| Birger Larsen, Christina Lioma, Ingo Frommholz, Hinrich Schütze | ||
| Pages: 1131-1132 | ||
| doi>10.1145/2348283.2348504 | ||
Books only represented by brief metadata (book records) are particularly hard to retrieve. One way of improving their retrieval is by extracting retrieval enhancing features from them. This work focusses on scientific (physics) book records. We ...
| Queries without clicks: evaluating retrieval effectiveness based on user feedback | ||
| Athanasia Koumpouri, Vasiliki Simaki | ||
| Pages: 1133-1134 | ||
| doi>10.1145/2348283.2348505 | ||
Until recently, the lack of user activity on search results was perceived as a sign of user dissatisfaction from retrieval performance. However, recent studies have reported that some queries might not be followed by clicks to the content of the retrieved ...
| Retrieval evaluation on focused tasks | ||
| Besnik Fetahu, Ralf Schenkel | ||
| Pages: 1135-1136 | ||
| doi>10.1145/2348283.2348506 | ||
Ranking of retrieval systems for focused tasks requires large number of relevance judgments. We propose an approach that minimizes the number of relevance judgments, where the performance measures are approximated using a Monte-Carlo sampling technique. ...
| Rewarding term location information to enhance probabilistic information retrieval | ||
| Jiashu Zhao, Jimmy Xiangji Huang, Shicheng Wu | ||
| Pages: 1137-1138 | ||
| doi>10.1145/2348283.2348507 | ||
We investigate the effect of rewarding terms according to their locations in documents for probabilistic information retrieval. The intuition behind our approach is that a large amount of authors would summarize their ideas in some particular parts of ...
| Scheduling queries across replicas | ||
| Ana Freire, Craig Macdonald, Nicola Tonellotto, Iadh Ounis, Fidel Cacheda | ||
| Pages: 1139-1140 | ||
| doi>10.1145/2348283.2348508 | ||
For increased efficiency, an information retrieval system can split its index into multiple shards, and then replicate these shards across many query servers. For each new query, an appropriate replica for each shard must be selected, such that the query ...
| Re-examining search result snippet examination time for relevance estimation | ||
| Dmitry Lagun, Eugene Agichtein | ||
| Pages: 1141-1142 | ||
| doi>10.1145/2348283.2348509 | ||
Previous studies of web search result examination have provided valuable insights in understanding and modelling searcher behavior. Yet, recent work (e.g., [3]) has been developed based on the assumption that the time a searcher spends examining a particular ...
| Sentiment identification by incorporating syntax, semantics and context information | ||
| Kunpeng Zhang, Yusheng Xie, Yu Cheng, Daniel Honbo, Doug Downey, Ankit Agrawal, Wei-keng Liao, Alok Choudhary | ||
| Pages: 1143-1144 | ||
| doi>10.1145/2348283.2348510 | ||
This paper proposes a method based on conditional random fields to incorporate sentence structure (syntax and semantics) and context information to identify sentiments of sentences within a document. It also proposes and evaluates two different active ...
| Short text classification using very few words | ||
| Aixin Sun | ||
| Pages: 1145-1146 | ||
| doi>10.1145/2348283.2348511 | ||
We propose a simple, scalable, and non-parametric approach for short text classification. Leveraging the well studied and scalable Information Retrieval (IR) framework, our approach mimics human labeling process for a piece of short text. It first selects ...
| Summarizing the differences from microblogs | ||
| Dingding Wang, Mitsunori Ogihara, Tao Li | ||
| Pages: 1147-1148 | ||
| doi>10.1145/2348283.2348512 | ||
With the rapid growth of social media websites, microblogging has become a popular way to spread instant news and events. Due to the dynamic and social nature of microblogs, extracting useful information from microblogs is more challenging than from ...
| Survival analysis of click logs | ||
| Si-Chi Chin, W. Nick Street | ||
| Pages: 1149-1150 | ||
| doi>10.1145/2348283.2348513 | ||
Click logs from search engines provide a rich opportunity to acquire implicit feedback from users. Patterns derived from the time between a posted query and a click provide information on the ranking quality, reflecting the perceived relevance of a retrieved ...
| Text selections as implicit relevance feedback | ||
| Ryen W. White, Georg Buscher | ||
| Pages: 1151-1152 | ||
| doi>10.1145/2348283.2348514 | ||
Users' search activity has been used as implicit feedback to model search interests and improve the performance of search systems. In search engines, this behavior usually takes the form of queries and result clicks. However, richer data on how people ...
| Time to judge relevance as an indicator of assessor error | ||
| Mark D. Smucker, Chandra Prakash Jethani | ||
| Pages: 1153-1154 | ||
| doi>10.1145/2348283.2348515 | ||
When human assessors judge documents for their relevance to a search topic, it is possible for errors in judging to occur. As part of the analysis of the data collected from a 48 participant user study, we have discovered that when the participants made ...
| Towards alias detection without string similarity: an active learning based approach | ||
| Lili Jiang, Jianyong Wang, Ping Luo, Ning An, Min Wang | ||
| Pages: 1155-1156 | ||
| doi>10.1145/2348283.2348516 | ||
Entity aliases commonly exist and accurately detecting these aliases plays a vital role in various applications. In this paper, we use an active-learning-based method to detect aliases without string similarity. To minimize the cost on pairwise comparison, ...
| Towards zero-click mobile IR evaluation: knowing what and knowing when | ||
| Tetsuya Sakai | ||
| Pages: 1157-1158 | ||
| doi>10.1145/2348283.2348517 | ||
In this poster, we propose two evaluation tasks for mobile information access. The first task evaluates the system's ability to guess what the user's query should be given a context ("Knowing What"). The second task evaluates the system's ability to ...
| Twanchor text: a preliminary study of the value of tweets as anchor text | ||
| Gilad Mishne, Jimmy Lin | ||
| Pages: 1159-1160 | ||
| doi>10.1145/2348283.2348518 | ||
It is well known that anchor text plays an important role in search, providing signals that are often not present in the source document itself. The paper reports results of a preliminary investigation on the value of tweets and tweet conversations as ...
| Unsupervised linear score normalization revisited | ||
| Ilya Markov, Avi Arampatzis, Fabio Crestani | ||
| Pages: 1161-1162 | ||
| doi>10.1145/2348283.2348519 | ||
We give a fresh look into score normalization for merging result-lists, isolating the problem from other components. We focus on three of the simplest, practical, and widely-used linear methods which do not require any training data, i.e. MinMax, Sum, ...
| User-aware caching and prefetching query results in web search engines | ||
| Hongyuan Ma, Bin Wang | ||
| Pages: 1163-1164 | ||
| doi>10.1145/2348283.2348520 | ||
Query results caching is an efficient technique for Web search engines. In this paper we present User-Aware Cache, a novel approach tailored for query results caching, that is based on user characteristics. We then use a trace of around 30 million queries ...
| Using eye-tracking with dynamic areas of interest for analyzing interactive information retrieval | ||
| Vu Tuan Tran, Norbert Fuhr | ||
| Pages: 1165-1166 | ||
| doi>10.1145/2348283.2348521 | ||
Based on a new framework for capturing dynamic areas of interest in eye-tracking, we model the user search process as a Markov-chain. The analysis indicates possible system improvements and yields parameter estimates for the Interactive Probability Ranking ...
| Using PageRank to infer user preferences | ||
| Praveen Chandar, Ben Carterette | ||
| Pages: 1167-1168 | ||
| doi>10.1145/2348283.2348522 | ||
Recently, researchers have shown interest in the use of preference judgments for evaluation in IR literature. Although preference judgments have several advantages over absolute judgment, one of the major disadvantages is that the number of judgments ...
| Utilizing inter-document similarities in federated search | ||
| Savva Khalaman, Oren Kurland | ||
| Pages: 1169-1170 | ||
| doi>10.1145/2348283.2348523 | ||
We demonstrate the merits of using inter-document similarities for federated search. Specifically, we study a results merging method that utilizes information induced from clusters of similar documents created across the lists retrieved from the ...
| Want a coffee?: predicting users' trails | ||
| Wen Li, Carsten Eickhoff, Arjen P. de Vries | ||
| Pages: 1171-1172 | ||
| doi>10.1145/2348283.2348524 | ||
Twitter and Foursquare are two well-connected platforms for sharing information where growing numbers of users post location-related messages. In contrast to the longitude-latitude geotags commonly used online, e.g., on photos and tweets, new place-tags ...
| Will this #hashtag be popular tomorrow? | ||
| Zongyang Ma, Aixin Sun, Gao Cong | ||
| Pages: 1173-1174 | ||
| doi>10.1145/2348283.2348525 | ||
Hashtags are widely used in Twitter to define a shared context for events or topics. In this paper, we aim to predict hashtag popularity in near future (i.e., next day). Given a hashtag that has the potential to be popular in the next day, we construct ...
| $100,000 prize jackpot. call now!: identifying the pertinent features of SMS spam | ||
| Henry Tan, Nazli Goharian, Micah Sherr | ||
| Pages: 1175-1176 | ||
| doi>10.1145/2348283.2348526 | ||
Mobile SMS spam is on the rise and is a prevalent problem. While recent work has shown that simple machine learning techniques can distinguish between ham and spam with high accuracy, this paper explores the individual contributions of various textual ...
| TUTORIAL SESSION: Tutorial presentations | ||
| Beyond bag-of-words: machine learning for query-document matching in web search | ||
| Hang Li, Jun Xu | ||
| Pages: 1177-1177 | ||
| doi>10.1145/2348283.2348528 | ||
| Methods for mining and summarizing text conversations | ||
| Giuseppe Carenini, Gabriel Murray | ||
| Pages: 1178-1179 | ||
| doi>10.1145/2348283.2348529 | ||
More and more today, people are engaging in conversations via email, blogs, discussion forums, text messaging and other social media. A person may want to archive these conversations and later retrieve information about what was discussed, or analyze ...
| Crowdsourcing for search evaluation and social-algorithmic search | ||
| Matthew Lease, Omar Alonso | ||
| Pages: 1180-1180 | ||
| doi>10.1145/2348283.2348530 | ||
The first computers were people. Today, Internet-based access to 24/7 online human crowds has led to a renaissance of research in human computation and the advent of crowdsourcing. These new opportunities have brought a disruptive shift to research and ...
| (Big) usage data in web search | ||
| Ricardo Baeza-Yates, Yoelle Maarek | ||
| Pages: 1181-1182 | ||
| doi>10.1145/2348283.2348531 | ||
| A new look at old tricks: the fertile roots of current research | ||
| Paul Kantor | ||
| Pages: 1183-1183 | ||
| doi>10.1145/2348283.2348532 | ||
| Aspect-based opinion mining from product reviews | ||
| Samaneh Moghaddam, Martin Ester | ||
| Pages: 1184-1184 | ||
| doi>10.1145/2348283.2348533 | ||
"What other people think" has always been an important piece of information for most of us during the decision-making process. Today people tend to make their opinions available to other people via the Internet. As a result, the Web has become an excellent ...
| Experimental methods for information retrieval | ||
| Donald Metzler, Oren Kurland | ||
| Pages: 1185-1186 | ||
| doi>10.1145/2348283.2348534 | ||
| IR models: foundations and relationships | ||
| Thomas Roelleke | ||
| Pages: 1187-1188 | ||
| doi>10.1145/2348283.2348535 | ||
In IR research it is essential to know IR models. Research over the past years has consolidated the foundations of IR models. Moreover, relationships have been reported that help to use and position IR models. Knowing about the foundations and relationships ...
| Patent information retrieval: an instance of domain-specific search | ||
| Mihai Lupu | ||
| Pages: 1189-1190 | ||
| doi>10.1145/2348283.2348536 | ||
The tutorial aims to provide the IR researchers with an understanding of how the patent system works, the challenges that patent searchers face in using the existing tools and in adopting new methods developed in academia. At the same time, the tutorial ...
| Medical information retrieval: an instance of domain-specific search | ||
| Allan Hanbury | ||
| Pages: 1191-1192 | ||
| doi>10.1145/2348283.2348537 | ||
Due to an explosion in the amount of medical information available, search techniques are gaining importance in the medical domain. This tutorial discusses recent results on search in the medical domain, including the outcome of surveys on end user requirements, ...
| Visual information retrieval using Java and LIRE | ||
| Oge Marques, Mathias Lux | ||
| Pages: 1193-1193 | ||
| doi>10.1145/2348283.2348538 | ||
Visual information retrieval (VIR) is an active and vibrant research area, which attempts to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories. The goal of ...
| Large-scale graph mining and learning for information retrieval | ||
| Bin Gao, Taifeng Wang, Tie-Yan Liu | ||
| Pages: 1194-1195 | ||
| doi>10.1145/2348283.2348539 | ||
For many information retrieval applications, we need to deal with the ranking problem on very large scale graphs. However, it is non-trivial to perform efficient and effective ranking on them. On one aspect, we need to design scalable algorithms. On ...
| Query performance prediction for IR | ||
| David Carmel, Oren Kurland | ||
| Pages: 1196-1197 | ||
| doi>10.1145/2348283.2348540 | ||
The goal of this tutorial is to expose participants to current research on query performance prediction. Participants will become familiar with state-of-the-art performance prediction methods, with common evaluation methodologies of prediction quality, ...
| Collaborative information seeking: art and science of achieving 1+1>2 in IR | ||
| Chirag Shah | ||
| Pages: 1198-1199 | ||
| doi>10.1145/2348283.2348541 | ||
The assumption of information seekers being independent and IR problem being individual has been challenged often in the recent past, with an argument that the next big leap in search and retrieval will come through incorporating social and collaborative ...
| Advances on the development of evaluation measures | ||
| Ben Carterette, Evangelos Kanoulas, Emine Yilmaz | ||
| Pages: 1200-1201 | ||
| doi>10.1145/2348283.2348542 | ||
The goal of the tutorial is to provide attendees with a comprehensive overview of the latest advances in the development of information retrieval evaluation measures and discuss the current challenges in the area. A number of topics are covered, including ...
We are delighted to welcome you to the 35th edition of SIGIR, the ACM International Conference on Research and Development in Information Retrieval. The conference continues its tradition of being the premier forum for research and development in information retrieval, the computer science discipline behind what many call "search". The high number of submitted papers again this year demonstrates both the breadth and depth of the research being done in this vibrant field, in academia and industry alike. We have done our best to ensure that these papers meet high standards of quality in terms of technical contribution, innovation, presentation, reference to previous work, and methodology. At the same time, we have tried to be flexible in the application of these criteria in order to consider papers describing novel and innovative work that may be somewhat unconventional.
The conference received 483 full paper submissions this year. Examining the country code of each paper's contact author, we found that 185 (38%) came from the Americas; 158 (33%) from the Asia and Pacific region; and 140 (29%) from Europe, the Middle East and Africa. Of the 483 submissions, 98 (20%) were accepted, essentially the same as last year's acceptance rate and up from the 16.7% rate of the year before. There was almost no difference in the acceptance rates across the three broad regions. The top five countries in terms of accepted papers were the U.S.A. (36), China (14), the U.K. and Spain (7 each), and the Netherlands (6). In addition, 208 short papers were submitted to the poster track, of which 76 (36.5%) were accepted. In the other categories, 17 demonstrations (47.2%), 4 workshops, and 16 tutorials were accepted. The top five technical areas covered by the accepted papers (as inferred from the primary keyword assigned by the authors) were queries and query analysis (18%), retrieval models and ranking (14%), web IR and social media search (13%), document representation and content analysis (11%), and users and interactive IR (9%). This was a small re-ordering of the topics from last year.
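The percentages quoted in this paragraph can be checked directly against the reported counts. The following sketch is illustrative only: the helper `share` and the variable names are assumptions, while the counts are those given in the paragraph.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Counts reported in the paragraph (full papers and poster-track short papers).
full_submitted = 483
full_accepted = 98
short_submitted = 208
short_accepted = 76
by_region = {
    "Americas": 185,
    "Asia and Pacific": 158,
    "Europe, Middle East and Africa": 140,
}

# The regional counts should add up to the total number of full submissions.
regions_total = sum(by_region.values())

full_rate = share(full_accepted, full_submitted)
short_rate = share(short_accepted, short_submitted)
region_shares = {
    name: share(count, full_submitted) for name, count in by_region.items()
}

if __name__ == "__main__":
    print("regions sum to total:", regions_total == full_submitted)
    print("full paper acceptance rate (%):", full_rate)
    print("short paper acceptance rate (%):", short_rate)
    for name, pct in region_shares.items():
        print(name, pct)
```

The regional counts sum to the 483 submissions, and the computed rates (20.3% for full papers, 36.5% for short papers) agree with the rounded figures in the text.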
SIGIR this year again used a two-tier double-blind reviewing approach. In a first stage, at least three reviewers read every paper and provided ratings and comments. Then, in a second stage, the primary and secondary Area Chairs ensured the quality of the reviewing process by studying, validating, and summarizing these reviews, and adding their own feedback and ratings. When required, Area Chairs initiated a discussion among the reviewers to resolve any controversial issues or significant differences of opinion. Once the discussion stage was completed, the two Area Chairs made the final decisions for nearly all submitted papers. At the program committee meeting held in Haifa, Israel, the Program Chairs and the attending Area Chairs went over the reviews, verified the process, gathered additional input, and made decisions in the few cases for which assistance had been requested.
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
| SESSION: Keynote address 1 | ||
| Future of the web and search | ||
| Qi Lu | ||
| Pages: 1-2 | ||
| doi>10.1145/2009916.2009918 | ||
No one doubts that we have only scratched the surface of what is possible with the Web. The day is coming fast when the Web will become almost a virtual mind reader. Your intent, interests, and needs will be instantly perceived and the information you ...
| SESSION: Keynote address 2 | ||
| Beyond search: statistical topic models for text analysis | ||
| ChengXiang Zhai | ||
| Pages: 3-4 | ||
| doi>10.1145/2009916.2009920 | ||
Search is generally a means to the end of finishing a task. While the current search engines are useful to users for finding relevant information, they offer little help to users for further digesting and analyzing the overwhelming found information ...
| SESSION: Users 1 | ||
| Modeling and analysis of cross-session search tasks | ||
| Alexander Kotov, Paul N. Bennett, Ryen W. White, Susan T. Dumais, Jaime Teevan | ||
| Pages: 5-14 | ||
| doi>10.1145/2009916.2009922 | ||
The information needs of search engine users vary in complexity, depending on the task they are trying to accomplish. Some simple needs can be satisfied with a single query, whereas others require a series of queries issued over a longer period of time. ...
| The economics in interactive information retrieval | ||
| Leif Azzopardi | ||
| Pages: 15-24 | ||
| doi>10.1145/2009916.2009923 | ||
Searching is inherently an interactive process usually requiring numerous iterations of querying and assessing in order to find the desired amount of relevant information. Essentially, the search process can be viewed as a combination of inputs (queries ...
| Seeding simulated queries with user-study data for personal search evaluation | ||
| David Elsweiler, David E. Losada, José C. Toucedo, Ronald T. Fernandez | ||
| Pages: 25-34 | ||
| doi>10.1145/2009916.2009924 | ||
In this paper we perform a lab-based user study (n=21) of email re-finding behaviour, examining how the characteristics of submitted queries change in different situations. A number of logistic regression models are developed on the query data to explore ...
|
||
| Understanding re-finding behavior in naturalistic email interaction logs | ||
| David Elsweiler, Morgan Harvey, Martin Hacker | ||
| Pages: 35-44 | ||
| doi>10.1145/2009916.2009925 | ||
|
In this paper we present a longitudinal, naturalistic study of email behavior (n=47) and describe our efforts at isolating re-finding behavior in the logs through various qualitative and quantitative analyses. The presented work underlines the methodological ...
|
||
| SESSION: Query analysis I | ||
| People searching for people: analysis of a people search engine log | ||
| Wouter Weerkamp, Richard Berendsen, Bogomil Kovachev, Edgar Meij, Krisztian Balog, Maarten de Rijke | ||
| Pages: 45-54 | ||
| doi>10.1145/2009916.2009927 | ||
|
Recent years show an increasing interest in vertical search: searching within a particular type of information. Understanding what people search for in these "verticals" gives direction to research and provides pointers for the search engines themselves. ...
|
||
| Learning search tasks in queries and web pages via graph regularization | ||
| Ming Ji, Jun Yan, Siyu Gu, Jiawei Han, Xiaofei He, Wei Vivian Zhang, Zheng Chen | ||
| Pages: 55-64 | ||
| doi>10.1145/2009916.2009928 | ||
|
As the Internet grows explosively, search engines play a more and more important role for users in effectively accessing online information. Recently, it has been recognized that a query is often triggered by a search task that the user wants to accomplish. ...
|
||
| Intentions and attention in exploratory health search | ||
| Marc-Allen Cartright, Ryen W. White, Eric Horvitz | ||
| Pages: 65-74 | ||
| doi>10.1145/2009916.2009929 | ||
|
We study information goals and patterns of attention in exploratory search for health information on the Web, reporting results of a large-scale log-based study. We examine search activity associated with the goal of diagnosing illness from symptoms ...
|
||
| User behavior in zero-recall ecommerce queries | ||
| Gyanit Singh, Nish Parikh, Neel Sundaresan | ||
| Pages: 75-84 | ||
| doi>10.1145/2009916.2009930 | ||
|
User expectation and experience for web search and eCommerce (product) search are quite different. Product descriptions are concise as compared to typical web documents. User expectation is more specific to find the right product. The difference in the ...
|
||
| SESSION: Learning to rank | ||
| Bagging gradient-boosted trees for high precision, low variance ranking models | ||
| Yasser Ganjisaffar, Rich Caruana, Cristina Videira Lopes | ||
| Pages: 85-94 | ||
| doi>10.1145/2009916.2009932 | ||
|
Recent studies have shown that boosting provides excellent predictive performance across a wide variety of tasks. In Learning-to-rank, boosted models such as RankBoost and LambdaMART have been shown to be among the best performing learning methods based ...
|
||
| Learning to rank for freshness and relevance | ||
| Na Dai, Milad Shokouhi, Brian D. Davison | ||
| Pages: 95-104 | ||
| doi>10.1145/2009916.2009933 | ||
|
Freshness of results is important in modern web search. Failing to recognize the temporal aspect of a query can negatively affect the user experience, and make the search engine appear stale. While freshness and relevance can be closely related for some ...
|
||
| A cascade ranking model for efficient ranked retrieval | ||
| Lidan Wang, Jimmy Lin, Donald Metzler | ||
| Pages: 105-114 | ||
| doi>10.1145/2009916.2009934 | ||
|
There is a fundamental tradeoff between effectiveness and efficiency when designing retrieval models for large-scale document collections. Effectiveness tends to derive from sophisticated ranking functions, such as those constructed using learning to ...
|
||
| Relevant knowledge helps in choosing right teacher: active query selection for ranking adaptation | ||
| Peng Cai, Wei Gao, Aoying Zhou, Kam-Fai Wong | ||
| Pages: 115-124 | ||
| doi>10.1145/2009916.2009935 | ||
|
Learning to adapt in a new setting is a common challenge to our knowledge and capability. New life would be easier if we actively pursued supervision from the right mentor chosen with our relevant but limited prior knowledge. This variant principle of ...
|
||
| SESSION: Personalization | ||
| SCENE: a scalable two-stage personalized news recommendation system | ||
| Lei Li, Dingding Wang, Tao Li, Daniel Knox, Balaji Padmanabhan | ||
| Pages: 125-134 | ||
| doi>10.1145/2009916.2009937 | ||
|
Recommending news articles has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Traditional news recommendation systems strive to adapt their services to individual ...
|
||
| Inferring and using location metadata to personalize web search | ||
| Paul N. Bennett, Filip Radlinski, Ryen W. White, Emine Yilmaz | ||
| Pages: 135-144 | ||
| doi>10.1145/2009916.2009938 | ||
|
Personalization of search results offers the potential for significant improvements in Web search. Among the many observable user attributes, approximate user location is particularly simple for search engines to obtain and allows personalization even ...
|
||
| Active learning to maximize accuracy vs. effort in interactive information retrieval | ||
| Aibo Tian, Matthew Lease | ||
| Pages: 145-154 | ||
| doi>10.1145/2009916.2009939 | ||
|
We consider an interactive information retrieval task in which the user is interested in finding several to many relevant documents with minimal effort. Given an initial document ranking, user interaction with the system produces relevance feedback (RF) ...
|
||
| SESSION: Retrieval models I | ||
| CRTER: using cross terms to enhance probabilistic information retrieval | ||
| Jiashu Zhao, Jimmy Xiangji Huang, Ben He | ||
| Pages: 155-164 | ||
| doi>10.1145/2009916.2009941 | ||
|
Term proximity retrieval rewards a document where the matched query terms occur close to each other. Although term proximity is known to be effective in many Information Retrieval (IR) applications, the within-document distribution of each individual ...
|
||
| A boosting approach to improving pseudo-relevance feedback | ||
| Yuanhua Lv, ChengXiang Zhai, Wan Chen | ||
| Pages: 165-174 | ||
| doi>10.1145/2009916.2009942 | ||
|
Pseudo-relevance feedback has proven effective for improving the average retrieval performance. Unfortunately, many experiments have shown that although pseudo-relevance feedback helps many queries, it also often hurts many other queries, limiting its ...
|
||
| Enhancing ad-hoc relevance weighting using probability density estimation | ||
| Xiaofeng Zhou, Jimmy Xiangji Huang, Ben He | ||
| Pages: 175-184 | ||
| doi>10.1145/2009916.2009943 | ||
|
Classical probabilistic information retrieval (IR) models, e.g. BM25, deal with document length based on a trade-off between the Verbosity hypothesis, which assumes the independence of a document's relevance of its length, and the Scope hypothesis, which ...
|
||
| SESSION: Social media | ||
| Who should share what?: item-level social influence prediction for users and posts ranking | ||
| Peng Cui, Fei Wang, Shaowei Liu, Mingdong Ou, Shiqiang Yang, Lifeng Sun | ||
| Pages: 185-194 | ||
| doi>10.1145/2009916.2009945 | ||
|
People and information are two core dimensions in a social network. People sharing information (such as blogs, news, albums, etc.) is the basic behavior. In this paper, we focus on predicting item-level social influence to answer the question Who should ...
|
||
| Mining tags using social endorsement networks | ||
| Theodoros Lappas, Kunal Punera, Tamas Sarlos | ||
| Pages: 195-204 | ||
| doi>10.1145/2009916.2009946 | ||
|
Entities on social systems, such as users on Twitter, and images on Flickr, are at the core of many interesting applications: they can be ranked in search results, recommended to users, or used in contextual advertising. Such applications assume knowledge ...
|
||
| Crowdsourcing for book search evaluation: impact of HIT design on comparative system ranking | ||
| Gabriella Kazai, Jaap Kamps, Marijn Koolen, Natasa Milic-Frayling | ||
| Pages: 205-214 | ||
| doi>10.1145/2009916.2009947 | ||
|
The evaluation of information retrieval (IR) systems over special collections, such as large book repositories, is out of reach of traditional methods that rely upon editorial relevance judgments. Increasingly, the use of crowdsourcing to collect relevance ...
|
||
| SESSION: Content analysis | ||
| A site oriented method for segmenting web pages | ||
| David Fernandes, Edleno Silva de Moura, Altigran Soares da Silva, Berthier Ribeiro-Neto, Edisson Braga | ||
| Pages: 215-224 | ||
| doi>10.1145/2009916.2009949 | ||
|
Information about how to segment a Web page can be used nowadays by applications such as segment aware Web search, classification and link analysis. In this research, we propose a fully automatic method for page segmentation and evaluate its application ...
|
||
| Composite hashing with multiple information sources | ||
| Dan Zhang, Fei Wang, Luo Si | ||
| Pages: 225-234 | ||
| doi>10.1145/2009916.2009950 | ||
|
Similarity search applications with a large amount of text and image data demand an efficient and effective solution. One useful strategy is to represent the examples in databases as compact binary codes through semantic hashing, which has attracted ...
|
||
| Detecting outlier sections in US congressional legislation | ||
| Elif Aktolga, Irene Ros, Yannick Assogba | ||
| Pages: 235-244 | ||
| doi>10.1145/2009916.2009951 | ||
|
Reading congressional legislation, also known as bills, is often tedious because bills tend to be long and written in complex language. In IBM Many Bills, an interactive web-based visualization of legislation, users of different backgrounds can browse ...
|
||
| DOM based content extraction via text density | ||
| Fei Sun, Dandan Song, Lejian Liao | ||
| Pages: 245-254 | ||
| doi>10.1145/2009916.2009952 | ||
|
In addition to the main content, most web pages also contain navigation panels, advertisements and copyright and disclaimer notices. This additional content, which is also known as noise, is typically not related to the main subject and may hamper the ...
|
||
| SESSION: Web IR | ||
| Social context summarization | ||
| Zi Yang, Keke Cai, Jie Tang, Li Zhang, Zhong Su, Juanzi Li | ||
| Pages: 255-264 | ||
| doi>10.1145/2009916.2009954 | ||
|
We study a novel problem of social context summarization for Web documents. Traditional summarization research has focused on extracting informative sentences from standard documents. With the rapid growth of online social networks, abundant user generated ...
|
||
| Probabilistic factor models for web site recommendation | ||
| Hao Ma, Chao Liu, Irwin King, Michael R. Lyu | ||
| Pages: 265-274 | ||
| doi>10.1145/2009916.2009955 | ||
|
Due to the prevalence of personalization and information filtering applications, modeling users' interests on the Web has become increasingly important during the past few years. In this paper, aiming at providing accurate personalized Web site recommendations ...
|
||
| Efficiently collecting relevance information from clickthroughs for web retrieval system evaluation | ||
| Jing He, Wayne Xin Zhao, Baihan Shu, Xiaoming Li, Hongfei Yan | ||
| Pages: 275-284 | ||
| doi>10.1145/2009916.2009956 | ||
|
Various click models have been recently proposed as a principled approach to infer the relevance of documents from the clickthrough data. The inferred document relevance is potentially useful in evaluating the Web retrieval systems. In practice, it generally ...
|
||
| Unsupervised query segmentation using clickthrough for information retrieval | ||
| Yanen Li, Bo-Jun Paul Hsu, ChengXiang Zhai, Kuansan Wang | ||
| Pages: 285-294 | ||
| doi>10.1145/2009916.2009957 | ||
|
Query segmentation is an important task toward understanding queries accurately, which is essential for improving search results. Existing segmentation models either use labeled data to predict the segmentation boundaries, for which the training data ...
|
||
| SESSION: Collaborative filtering I | ||
| Collaborative competitive filtering: learning recommender using context of user choice | ||
| Shuang-Hong Yang, Bo Long, Alexander J. Smola, Hongyuan Zha, Zhaohui Zheng | ||
| Pages: 295-304 | ||
| doi>10.1145/2009916.2009959 | ||
|
While a user's preference is directly reflected in the interactive choice process between her and the recommender, this wealth of information was not fully exploited for learning recommender models. In particular, existing collaborative filtering (CF) ...
|
||
| CLR: a collaborative location recommendation framework based on co-clustering | ||
| Kenneth Wai-Ting Leung, Dik Lun Lee, Wang-Chien Lee | ||
| Pages: 305-314 | ||
| doi>10.1145/2009916.2009960 | ||
|
GPS data tracked on mobile devices contains rich information about human activities and preferences. In this paper, GPS data is used in location-based services (LBSs) to provide collaborative location recommendations. We observe that most existing LBSs ...
|
||
| Functional matrix factorizations for cold-start recommendation | ||
| Ke Zhou, Shuang-Hong Yang, Hongyuan Zha | ||
| Pages: 315-324 | ||
| doi>10.1145/2009916.2009961 | ||
|
A key challenge in recommender system research is how to effectively profile new users, a problem generally known as cold-start recommendation. Recently the idea of progressively querying user responses through an initial interview ...
|
||
| Exploiting geographical influence for collaborative point-of-interest recommendation | ||
| Mao Ye, Peifeng Yin, Wang-Chien Lee, Dik-Lun Lee | ||
| Pages: 325-334 | ||
| doi>10.1145/2009916.2009962 | ||
|
In this paper, we aim to provide a point-of-interest (POI) recommendation service for the rapidly growing location-based social networks (LBSNs), e.g., Foursquare, Whrrl, etc. Our idea is to explore user preference, social influence and geographical influence ...
|
||
| SESSION: Users II | ||
| Why searchers switch: understanding and predicting engine switching rationales | ||
| Qi Guo, Ryen W. White, Yunqiao Zhang, Blake Anderson, Susan T. Dumais | ||
| Pages: 335-344 | ||
| doi>10.1145/2009916.2009964 | ||
|
Search engine switching is the voluntary transition between Web search engines. Engine switching can occur for a number of reasons, including user dissatisfaction with search results, a desire for broader topic coverage or verification, user preferences, ...
|
||
| Find it if you can: a game for modeling different types of web search success using interaction data | ||
| Mikhail Ageev, Qi Guo, Dmitry Lagun, Eugene Agichtein | ||
| Pages: 345-354 | ||
| doi>10.1145/2009916.2009965 | ||
|
A better understanding of strategies and behavior of successful searchers is crucial for improving the experience of all searchers. However, research of search behavior has been struggling with the tension between the relatively small-scale, but controlled ...
|
||
| Measuring improvement in user search performance resulting from optimal search tips | ||
| Neema Moraveji, Daniel Russell, Jacob Bien, David Mease | ||
| Pages: 355-364 | ||
| doi>10.1145/2009916.2009966 | ||
|
Web search performance can be improved by either improving the search engine itself or by educating the user to search more efficiently. There is a large amount of literature describing techniques for measuring the former; whereas, improvements resulting ...
|
||
| ViewSer: enabling large-scale remote user studies of web search examination and interaction | ||
| Dmitry Lagun, Eugene Agichtein | ||
| Pages: 365-374 | ||
| doi>10.1145/2009916.2009967 | ||
|
Web search behaviour studies, including eye-tracking studies of search result examination, have resulted in numerous insights to improve search result quality and presentation. Yet, eye tracking studies have been restricted in scale, due to the expense ...
|
||
| SESSION: Query analysis II | ||
| CrowdLogging: distributed, private, and anonymous search logging | ||
| Henry Allen Feild, James Allan, Joshua Glatt | ||
| Pages: 375-384 | ||
| doi>10.1145/2009916.2009969 | ||
|
We describe CrowdLogging, an approach for distributed search log collection, storage, and mining, with the dual goals of preserving privacy and making the mined information broadly available. Most search log mining approaches and most privacy enhancing ...
|
||
| Out of sight, not out of mind: on the effect of social and physical detachment on information need | ||
| Elad Yom-Tov, Fernando Diaz | ||
| Pages: 385-394 | ||
| doi>10.1145/2009916.2009970 | ||
|
The information needs of users and the documents which answer them are frequently contingent on the different characteristics of users. This is especially evident during natural disasters, such as earthquakes and violent weather incidents, which create ...
|
||
| Scalable multi-dimensional user intent identification using tree structured distributions | ||
| Vinay Jethava, Liliana Calderón-Benavides, Ricardo Baeza-Yates, Chiranjib Bhattacharyya, Devdatt Dubhashi | ||
| Pages: 395-404 | ||
| doi>10.1145/2009916.2009971 | ||
|
The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often ...
|
||
| Social annotation in query expansion: a machine learning approach | ||
| Yuan Lin, Hongfei Lin, Song Jin, Zheng Ye | ||
| Pages: 405-414 | ||
| doi>10.1145/2009916.2009972 | ||
|
Automatic query expansion technologies have been proven to be effective in many information retrieval tasks. Most existing approaches are based on the assumption that the most informative terms in top-retrieved documents can be viewed as context of the ...
|
||
| SESSION: Communities | ||
| Predicting web searcher satisfaction with existing community-based answers | ||
| Qiaoling Liu, Eugene Agichtein, Gideon Dror, Evgeniy Gabrilovich, Yoelle Maarek, Dan Pelleg, Idan Szpektor | ||
| Pages: 415-424 | ||
| doi>10.1145/2009916.2009974 | ||
|
Community-based Question Answering (CQA) sites, such as Yahoo! Answers, Baidu Knows, Naver, and Quora, have been rapidly growing in popularity. The resulting archives of posted answers to questions, in Yahoo! Answers alone, already exceed in size 1 billion, ...
|
||
| Competition-based user expertise score estimation | ||
| Jing Liu, Young-In Song, Chin-Yew Lin | ||
| Pages: 425-434 | ||
| doi>10.1145/2009916.2009975 | ||
|
In this paper, we consider the problem of estimating the relative expertise score of users in community question and answering services (CQA). Previous approaches typically only utilize the explicit question answering relationship between askers and ...
|
||
| Learning online discussion structures by conditional random fields | ||
| Hongning Wang, Chi Wang, ChengXiang Zhai, Jiawei Han | ||
| Pages: 435-444 | ||
| doi>10.1145/2009916.2009976 | ||
|
Online forum discussions are emerging as a valuable information repository, where knowledge is accumulated by the interaction among users, leading to multiple threads with structures. Such replying structure in each thread conveys important information ...
|
||
| Mining topics on participations for community discovery | ||
| Guoqing Zheng, Jinwen Guo, Lichun Yang, Shengliang Xu, Shenghua Bao, Zhong Su, Dingyi Han, Yong Yu | ||
| Pages: 445-454 | ||
| doi>10.1145/2009916.2009977 | ||
|
Community discovery on large-scale linked document corpora has been a hot research topic for decades. There are two types of links. The first one, which we call d2d-link, indicates connectiveness among different documents, such as blog references and ...
|
||
| SESSION: Classification | ||
| Authorship classification: a discriminative syntactic tree mining approach | ||
| Sangkyum Kim, Hyungsul Kim, Tim Weninger, Jiawei Han, Hyun Duk Kim | ||
| Pages: 455-464 | ||
| doi>10.1145/2009916.2009979 | ||
|
In the past, there have been dozens of studies on automatic authorship classification, and many of these studies concluded that the writing style is one of the best indicators for original authorship. From among the hundreds of features which were developed, ...
|
||
| On theme location discovery for travelogue services | ||
| Mao Ye, Rong Xiao, Wang-Chien Lee, Xing Xie | ||
| Pages: 465-474 | ||
| doi>10.1145/2009916.2009980 | ||
|
In this paper, we aim to develop a travelogue service that discovers and conveys various travelogue digests, in the form of theme locations, geographical scope, traveling trajectory and location snippet, to users. In this service, theme locations in a travelogue ...
|
||
| Effective sentiment stream analysis with self-augmenting training and demand-driven projection | ||
| Ismael Santana Silva, Janaína Gomide, Adriano Veloso, Wagner Meira, Jr., Renato Ferreira | ||
| Pages: 475-484 | ||
| doi>10.1145/2009916.2009981 | ||
|
How do we analyze sentiments over a set of opinionated Twitter messages? This issue has been widely studied in recent years, with a prominent approach being based on the application of classification techniques. Basically, messages are classified according ...
|
||
| SESSION: Retrieval models II | ||
| Hypergeometric language models for republished article finding | ||
| Manos Tsagkias, Maarten de Rijke, Wouter Weerkamp | ||
| Pages: 485-494 | ||
| doi>10.1145/2009916.2009983 | ||
|
Republished article finding is the task of identifying instances of articles that have been published in one source and republished more or less verbatim in another source, which is often a social media source. We address this task as an ad hoc retrieval ...
|
||
| Estimation methods for ranking recent information | ||
| Miles Efron, Gene Golovchinsky | ||
| Pages: 495-504 | ||
| doi>10.1145/2009916.2009984 | ||
|
Temporal aspects of documents can impact relevance for certain kinds of queries. In this paper, we build on earlier work of modeling temporal information. We propose an extension to the Query Likelihood Model that incorporates query-specific information ...
|
||
| Query by document via a decomposition-based two-level retrieval approach | ||
| Linkai Weng, Zhiwei Li, Rui Cai, Yaoxue Zhang, Yuezhi Zhou, Laurence T. Yang, Lei Zhang | ||
| Pages: 505-514 | ||
| doi>10.1145/2009916.2009985 | ||
|
Retrieving similar documents from a large-scale text corpus according to a given document is a fundamental technique for many applications. However, most existing indexing techniques have difficulty addressing this problem due to special properties ...
|
||
| SESSION: Image search | ||
| Integrating hierarchical feature selection and classifier training for multi-label image annotation | ||
| Cheng Jin, Chunlei Yang | ||
| Pages: 515-524 | ||
| doi>10.1145/2009916.2009987 | ||
|
It is well accepted that using high-dimensional multi-modal visual features for image content representation and classifier training may achieve more sufficient characterization of the diverse visual properties of the images and further result in higher ...
|
||
| Efficient manifold ranking for image retrieval | ||
| Bin Xu, Jiajun Bu, Chun Chen, Deng Cai, Xiaofei He, Wei Liu, Jiebo Luo | ||
| Pages: 525-534 | ||
| doi>10.1145/2009916.2009988 | ||
|
Manifold Ranking (MR), a graph-based ranking algorithm, has been widely applied in information retrieval and shown to have excellent performance and feasibility on a variety of data types. Particularly, it has been successfully applied to content-based ...
|
||
| Mining weakly labeled web facial images for search-based face annotation | ||
| Dayong Wang, Steven C.H. Hoi, Ying He | ||
| Pages: 535-544 | ||
| doi>10.1145/2009916.2009989 | ||
|
In this paper, we investigate a search-based face annotation framework by mining weakly labeled facial images that are freely available on the internet. A key component of such a search-based annotation paradigm is to build a database of facial images ...
|
||
| SESSION: Indexing | ||
| Temporal index sharding for space-time efficiency in archive search | ||
| Avishek Anand, Srikanta Bedathur, Klaus Berberich, Ralf Schenkel | ||
| Pages: 545-554 | ||
| doi>10.1145/2009916.2009991 | ||
|
Time-travel queries that couple temporal constraints with keyword queries are useful in searching large-scale archives of time-evolving content such as the web archives or wikis. Typical approaches for efficient evaluation of these queries involve slicing ...
|
||
| Inverted indexes for phrases and strings | ||
| Manish Patil, Sharma V. Thankachan, Rahul Shah, Wing-Kai Hon, Jeffrey Scott Vitter, Sabrina Chandrasekaran | ||
| Pages: 555-564 | ||
| doi>10.1145/2009916.2009992 | ||
|
Inverted indexes are the most fundamental and widely used data structures in information retrieval. For each unique word occurring in a document collection, the inverted index stores a list of the documents in which this word occurs. Compression techniques ...
|
||
| Faster temporal range queries over versioned text | ||
| Jinru He, Torsten Suel | ||
| Pages: 565-574 | ||
| doi>10.1145/2009916.2009993 | ||
|
Versioned textual collections are collections that retain multiple versions of a document as it evolves over time. Important large-scale examples are Wikipedia and the web collection of the Internet Archive. Search queries over such collections often ...
|
||
| Indexing strategies for graceful degradation of search quality | ||
| Shuai Ding, Sreenivas Gollapudi, Samuel Ieong, Krishnaram Kenthapadi, Alexandros Ntoulas | ||
| Pages: 575-584 | ||
| doi>10.1145/2009916.2009994 | ||
|
Large web search engines process billions of queries each day over tens of billions of documents with often very stringent requirements for a user's search experience, in particular, low latency and highly relevant search results. Index generation and ...
|
||
| SESSION: Web queries | ||
| Incremental diversification for very large sets: a streaming-based approach | ||
| Enrico Minack, Wolf Siberski, Wolfgang Nejdl | ||
| Pages: 585-594 | ||
| doi>10.1145/2009916.2009996 | ||
|
Result diversification is an effective method to reduce the risk that none of the returned results satisfies a user's query intention. It has been shown to decrease query abandonment substantially. On the other hand, computing an optimally diverse set ...
|
||
| Intent-aware search result diversification | ||
| Rodrygo L.T. Santos, Craig Macdonald, Iadh Ounis | ||
| Pages: 595-604 | ||
| doi>10.1145/2009916.2009997 | ||
|
Search result diversification has gained momentum as a way to tackle ambiguous queries. An effective approach to this problem is to explicitly model the possible aspects underlying a query, in order to maximise the estimated relevance of the retrieved ...
|
||
| Parameterized concept weighting in verbose queries | ||
| Michael Bendersky, Donald Metzler, W. Bruce Croft | ||
| Pages: 605-614 | ||
| doi>10.1145/2009916.2009998 | ||
|
The majority of the current information retrieval models weight the query concepts (e.g., terms or phrases) in an unsupervised manner, based solely on the collection statistics. In this paper, we go beyond the unsupervised estimation of concept weights, ...
|
||
| UPS: efficient privacy protection in personalized web search | ||
| Gang Chen, He Bai, Lidan Shou, Ke Chen, Yunjun Gao | ||
| Pages: 615-624 | ||
| doi>10.1145/2009916.2009999 | ||
|
In recent years, personalized web search (PWS) has demonstrated effectiveness in improving the quality of search service on the Internet. Unfortunately, the need for collecting private information in PWS has become a major barrier for its wide proliferation. ...
|
||
| SESSION: Collaborative filtering II | ||
| Handling data sparsity in collaborative filtering using emotion and semantic based features | ||
| Yashar Moshfeghi, Benjamin Piwowarski, Joemon M. Jose | ||
| Pages: 625-634 | ||
| doi>10.1145/2009916.2010001 | ||
|
Collaborative filtering (CF) aims to recommend items based on prior user interaction. Despite their success, CF techniques do not handle data sparsity well, especially in the case of the cold start problem where there is no past rating for an item. In ...
|
||
| Fast context-aware recommendations with factorization machines | ||
| Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, Lars Schmidt-Thieme | ||
| Pages: 635-644 | ||
| doi>10.1145/2009916.2010002 | ||
|
The situation in which a choice is made is important information for recommender systems. Context-aware recommenders take this information into account to make predictions. So far, the best performing method for context-aware rating prediction in ...
|
||
| Filtering semi-structured documents based on faceted feedback | ||
| Lanbo Zhang, Yi Zhang, Qianli Xing | ||
| Pages: 645-654 | ||
| doi>10.1145/2009916.2010003 | ||
|
Existing adaptive filtering systems learn user profiles based on users' relevance judgments on documents. In some cases, users have some prior knowledge about what features are important for a document to be relevant. For example, a Spanish speaker may ...
|
||
| Learning relevance from heterogeneous social network and its application in online targeting | ||
| Chi Wang, Rajat Raina, David Fong, Ding Zhou, Jiawei Han, Greg Badros | ||
| Pages: 655-664 | ||
| doi>10.1145/2009916.2010004 | ||
|
The rise of social networking services in recent years presents new research challenges for matching users with interesting content. While the content-rich nature of these social networks offers many cues on "interests" of a user such as text in user-generated ...
|
||
| SESSION: Latent semantic analysis | ||
| ILDA: interdependent LDA model for learning latent aspects and their ratings from online product reviews | ||
| Samaneh Moghaddam, Martin Ester | ||
| Pages: 665-674 | ||
| doi>10.1145/2009916.2010006 | ||
|
Today, more and more product reviews become available on the Internet, e.g., product review forums, discussion groups, and Blogs. However, it is almost impossible for a customer to read all of the different and possibly even contradictory opinions and ...
|
||
| Clickthrough-based latent semantic models for web search | ||
| Jianfeng Gao, Kristina Toutanova, Wen-tau Yih | ||
| Pages: 675-684 | ||
| doi>10.1145/2009916.2010007 | ||
|
This paper presents two new document ranking models for Web search based upon the methods of semantic representation and the statistical translation-based approach to information retrieval (IR). Assuming that a query is parallel to the titles of the ...
|
||
| Regularized latent semantic indexing | ||
| Quan Wang, Jun Xu, Hang Li, Nick Craswell | ||
| Pages: 685-694 | ||
| doi>10.1145/2009916.2010008 | ||
Topic modeling can boost the performance of information retrieval, but its real-world application is limited due to scalability issues. Scaling to larger document collections via parallelization is an active area of research, but most solutions require ...
|
||
| SESSION: Multimedia IR | ||
| Multimedia answering: enriching text QA with media information | ||
| Liqiang Nie, Meng Wang, Zhengjun Zha, Guangda Li, Tat-Seng Chua | ||
| Pages: 695-704 | ||
| doi>10.1145/2009916.2010010 | ||
Existing community question-answering forums usually provide only textual answers. However, for many questions, pure texts cannot provide intuitive information, while image or video contents are more appropriate. In this paper, we introduce a scheme ...
|
||
| Enhancing multi-label music genre classification through ensemble techniques | ||
| Chris Sanden, John Z. Zhang | ||
| Pages: 705-714 | ||
| doi>10.1145/2009916.2010011 | ||
In the field of Music Information Retrieval (MIR), multi-label genre classification is the problem of assigning one or more genre labels to a music piece. In this work, we propose a set of ensemble techniques, which are specific to the task of multi-label ...
|
||
| Picasso - to sing, you must close your eyes and draw | ||
| Aleksandar Stupar, Sebastian Michel | ||
| Pages: 715-724 | ||
| doi>10.1145/2009916.2010012 | ||
We study the problem of automatically assigning appropriate music pieces to a picture or, in general, series of pictures. This task, commonly referred to as soundtrack suggestion, is non-trivial as it requires a lot of human attention and a good deal ...
|
||
| SESSION: Summarization | ||
| Enhanced results for web search | ||
| Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco | ||
| Pages: 725-734 | ||
| doi>10.1145/2009916.2010014 | ||
"Ten blue links" have defined web search results for the last fifteen years -- snippets of text combined with document titles and URLs. In this paper, we establish the notion of enhanced search results that extend web search results to include multimedia ...
|
||
| Summarizing the differences in multilingual news | ||
| Xiaojun Wan, Houping Jia, Shanshan Huang, Jianguo Xiao | ||
| Pages: 735-744 | ||
| doi>10.1145/2009916.2010015 | ||
There usually exist many news articles written in different languages about a hot news event. The news articles in different languages are written in different ways to reflect different standpoints. For example, the Chinese news agencies and the Western ...
|
||
| Evolutionary timeline summarization: a balanced optimization framework via iterative substitution | ||
| Rui Yan, Xiaojun Wan, Jahna Otterbacher, Liang Kong, Xiaoming Li, Yan Zhang | ||
| Pages: 745-754 | ||
| doi>10.1145/2009916.2010016 | ||
Classic news summarization plays an important role with the exponential document growth on the Web. Many approaches are proposed to generate summaries but seldom simultaneously consider evolutionary characteristics of news in addition to traditional summary ...
|
||
| SESSION: Vertical & entity search | ||
| Ranking related news predictions | ||
| Nattiya Kanhabua, Roi Blanco, Michael Matthews | ||
| Pages: 755-764 | ||
| doi>10.1145/2009916.2010018 | ||
We estimate that nearly one third of news articles contain references to future events. While this information can prove crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access this ...
|
||
| Collective entity linking in web text: a graph-based method | ||
| Xianpei Han, Le Sun, Jun Zhao | ||
| Pages: 765-774 | ||
| doi>10.1145/2009916.2010019 | ||
Entity Linking (EL) is the task of linking name mentions in Web text with their referent entities in a knowledge base. Traditional EL methods usually link name mentions in a document by assuming them to be independent. However, there is often additional ...
|
||
| From one tree to a forest: a unified solution for structured web data extraction | ||
| Qiang Hao, Rui Cai, Yanwei Pang, Lei Zhang | ||
| Pages: 775-784 | ||
| doi>10.1145/2009916.2010020 | ||
Structured data, in the form of entities and associated attributes, has been a rich web resource for search engines and knowledge databases. To efficiently extract structured data from enormous websites in various verticals (e.g., books, restaurants), ...
|
||
| Improving local search ranking through external logs | ||
| Klaus Berberich, Arnd Christian König, Dimitrios Lymberopoulos, Peixiang Zhao | ||
| Pages: 785-794 | ||
| doi>10.1145/2009916.2010021 | ||
The signals used for ranking in local search are very different from web search: in addition to (textual) relevance, measures of (geographic) distance between the user and the search result, as well as measures of popularity of the result are important ...
|
||
| SESSION: Query suggestions | ||
| Query suggestions in the absence of query logs | ||
| Sumit Bhatia, Debapriyo Majumdar, Prasenjit Mitra | ||
| Pages: 795-804 | ||
| doi>10.1145/2009916.2010023 | ||
After an end-user has partially input a query, intelligent search engines can suggest possible completions of the partial query to help end-users quickly express their information needs. All major web-search engines and most proposed methods that suggest ...
|
||
| Synthesizing high utility suggestions for rare web search queries | ||
| Alpa Jain, Umut Ozertem, Emre Velipasaoglu | ||
| Pages: 805-814 | ||
| doi>10.1145/2009916.2010024 | ||
Search engines are continuously looking into methods to alleviate users' effort in finding desired information. For this, all major search engines employ query suggestions methods to facilitate effective query formulation and reformulation. Providing ...
|
||
| Post-ranking query suggestion by diversifying search results | ||
| Yang Song, Dengyong Zhou, Li-wei He | ||
| Pages: 815-824 | ||
| doi>10.1145/2009916.2010025 | ||
Query suggestion refers to the process of suggesting related queries to search engine users. Most existing research has focused on improving the relevance of suggested queries. In this paper, we introduce the concept of diversifying the content of ...
|
||
| Automatic boolean query suggestion for professional search | ||
| Youngho Kim, Jangwon Seo, W. Bruce Croft | ||
| Pages: 825-834 | ||
| doi>10.1145/2009916.2010026 | ||
In professional search environments, such as patent search or legal search, search tasks have unique characteristics: 1) users interactively issue several queries for a topic, and 2) users are willing to examine many retrieval results, i.e., there is ...
|
||
| SESSION: Linguistic analysis | ||
| Improved video categorization from text metadata and user comments | ||
| Katja Filippova, Keith B. Hall | ||
| Pages: 835-842 | ||
| doi>10.1145/2009916.2010028 | ||
We consider the task of assigning categories (e.g., howto/cooking, sports/basketball, pet/dogs) to YouTube videos from video and text signals. We show that two complementary views on the data -- from the video and text perspectives -- complement each ...
|
||
| Multifaceted toponym recognition for streaming news | ||
| Michael D. Lieberman, Hanan Samet | ||
| Pages: 843-852 | ||
| doi>10.1145/2009916.2010029 | ||
News sources on the Web generate constant streams of information, describing many aspects of the events that shape our world. In particular, geography plays a key role in the news, and enabling geographic retrieval of news articles involves recognizing ...
|
||
| Enriching document representation via translation for improved monolingual information retrieval | ||
| Seung-Hoon Na, Hwee Tou Ng | ||
| Pages: 853-862 | ||
| doi>10.1145/2009916.2010030 | ||
Word ambiguity and vocabulary mismatch are critical problems in information retrieval. To deal with these problems, this paper proposes the use of translated words to enrich document representation, going beyond the words in the original source language ...
|
||
| A novel corpus-based stemming algorithm using co-occurrence statistics | ||
| Jiaul H. Paik, Dipasree Pal, Swapan K. Parui | ||
| Pages: 863-872 | ||
| doi>10.1145/2009916.2010031 | ||
We present a stemming algorithm for text retrieval. The algorithm uses the statistics collected on the basis of certain corpus analysis based on the co-occurrence between two word variants. We use a very simple co-occurrence measure that reflects how ...
|
||
| SESSION: Clustering | ||
| Document clustering with universum | ||
| Dan Zhang, Jingdong Wang, Luo Si | ||
| Pages: 873-882 | ||
| doi>10.1145/2009916.2010033 | ||
Document clustering is a popular research topic, which aims to partition documents into groups of similar objects (i.e., clusters), and has been widely used in many applications such as automatic topic extraction, document organization and filtering. ...
|
||
| Identifying points of interest by self-tuning clustering | ||
| Yiyang Yang, Zhiguo Gong, Leong Hou U | ||
| Pages: 883-892 | ||
| doi>10.1145/2009916.2010034 | ||
Deducing trip related information from web-scale datasets has received very large amounts of attention recently. Identifying points of interest (POIs) in geo-tagged photos is one of these problems. The problem can be viewed as a standard clustering problem ...
|
||
| Cluster-based fusion of retrieved lists | ||
| Anna Khudyak Kozorovitsky, Oren Kurland | ||
| Pages: 893-902 | ||
| doi>10.1145/2009916.2010035 | ||
Methods for fusing document lists that were retrieved in response to a query often use retrieval scores (or ranks) of documents in the lists. We present a novel probabilistic fusion approach that utilizes an additional source of rich information, namely, ...
|
||
| SESSION: Effectiveness | ||
| System effectiveness, user models, and user utility: a conceptual framework for investigation | ||
| Ben Carterette | ||
| Pages: 903-912 | ||
| doi>10.1145/2009916.2010037 | ||
There is great interest in producing effectiveness measures that model user behavior in order to better model the utility of a system to its users. These measures are often formulated as a sum over the product of a discount function of ranks and a gain ...
|
||
| Evaluating the synergic effect of collaboration in information seeking | ||
| Chirag Shah, Roberto González-Ibáñez | ||
| Pages: 913-922 | ||
| doi>10.1145/2009916.2010038 | ||
It is typically expected that when people work together, they can often accomplish goals that are difficult or even impossible for individuals. We consider this notion of the group achieving more than the sum of all individuals' achievements to be the ...
|
||
| Repeatable and reliable search system evaluation using crowdsourcing | ||
| Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran Duc | ||
| Pages: 923-932 | ||
| doi>10.1145/2009916.2010039 | ||
The primary problem confronting any new kind of search task is how to boot-strap a reliable and repeatable evaluation campaign, and a crowd-sourcing approach provides many advantages. However, can these crowd-sourced evaluations be repeated over long ...
|
||
| SESSION: Multilingual IR | ||
| Cross-language web page classification via dual knowledge transfer using nonnegative matrix tri-factorization | ||
| Hua Wang, Heng Huang, Feiping Nie, Chris Ding | ||
| Pages: 933-942 | ||
| doi>10.1145/2009916.2010041 | ||
The lack of sufficient labeled Web pages in many languages, especially for those uncommonly used ones, presents a great challenge to traditional supervised classification methods to achieve satisfactory Web page classification performance. To address ...
|
||
| No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity | ||
| Ferhan Ture, Tamer Elsayed, Jimmy Lin | ||
| Pages: 943-952 | ||
| doi>10.1145/2009916.2010042 | ||
This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this problem are of general interest for text mining in the multi-lingual context ...
|
||
| An event-centric model for multilingual document similarity | ||
| Jannik Strötgen, Michael Gertz, Conny Junghans | ||
| Pages: 953-962 | ||
| doi>10.1145/2009916.2010043 | ||
Document similarity measures play an important role in many document retrieval and exploration tasks. Over the past decades, several models and techniques have been developed to determine a ranked list of documents similar to a given query document. ...
|
||
| SESSION: Efficiency | ||
| Posting list intersection on multicore architectures | ||
| Shirish Tatikonda, B. Barla Cambazoglu, Flavio P. Junqueira | ||
| Pages: 963-972 | ||
| doi>10.1145/2009916.2010045 | ||
In current commercial Web search engines, queries are processed in the conjunctive mode, which requires the search engine to compute the intersection of a number of posting lists to determine the documents matching all query terms. In practice, the intersection ...
|
||
| Timestamp-based result cache invalidation for web search engines | ||
| Sadiye Alici, Ismail Sengor Altingovde, Rifat Ozcan, Berkant Barla Cambazoglu, Özgür Ulusoy | ||
| Pages: 973-982 | ||
| doi>10.1145/2009916.2010046 | ||
The result cache is a vital component for efficiency of large-scale web search engines, and maintaining the freshness of cached query results is the current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify ...
|
||
| Energy-price-driven query processing in multi-center web search engines | ||
| Enver Kayaaslan, B. Barla Cambazoglu, Roi Blanco, Flavio P. Junqueira, Cevdet Aykanat | ||
| Pages: 983-992 | ||
| doi>10.1145/2009916.2010047 | ||
Concurrently processing thousands of web queries, each with a response time under a fraction of a second, necessitates maintaining and operating massive data centers. For large-scale web search engines, this translates into high energy consumption and ...
|
||
| Faster top-k document retrieval using block-max indexes | ||
| Shuai Ding, Torsten Suel | ||
| Pages: 993-1002 | ||
| doi>10.1145/2009916.2010048 | ||
Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization techniques called early termination achieves faster query processing by ...
|
||
| SESSION: Recommender systems | ||
| Utilizing marginal net utility for recommendation in e-commerce | ||
| Jian Wang, Yi Zhang | ||
| Pages: 1003-1012 | ||
| doi>10.1145/2009916.2010050 | ||
Traditional recommendation algorithms often select products with the highest predicted ratings to recommend. However, earlier research in economics and marketing indicates that a consumer usually makes purchase decision(s) based on the product's marginal ...
|
||
| Recommending ephemeral items at web scale | ||
| Ye Chen, John F. Canny | ||
| Pages: 1013-1022 | ||
| doi>10.1145/2009916.2010051 | ||
We describe an innovative and scalable recommendation system successfully deployed at eBay. To build recommenders for long-tail marketplaces requires projection of volatile items into a persistent space of latent products. We first present a generative ...
|
||
| A unified framework for recommendations based on quaternary semantic analysis | ||
| Chen Wei, Wynne Hsu, Mong Li Lee | ||
| Pages: 1023-1032 | ||
| doi>10.1145/2009916.2010052 | ||
Social network systems such as Facebook and YouTube have played a significant role in capturing both explicit and implicit user preferences for different items in the form of ratings and tags. This forms a quaternary relationship among users, items, ...
|
||
| Associative tag recommendation exploiting multiple textual features | ||
| Fabiano Belém, Eder Martins, Tatiana Pontes, Jussara Almeida, Marcos Gonçalves | ||
| Pages: 1033-1042 | ||
| doi>10.1145/2009916.2010053 | ||
This work addresses the task of recommending relevant tags to a target object by jointly exploiting three dimensions of the problem: (i) term co-occurrence with tags pre-assigned to the target object, (ii) terms extracted from multiple textual features, ...
|
||
| SESSION: Test collections | ||
| Evaluating diversified search results using per-intent graded relevance | ||
| Tetsuya Sakai, Ruihua Song | ||
| Pages: 1043-1052 | ||
| doi>10.1145/2009916.2010055 | ||
Search queries are often ambiguous and/or underspecified. To accommodate different user needs, search result diversification has received attention in the past few years. Accordingly, several new metrics for evaluating diversification have been proposed, ...
|
||
| Evaluating multi-query sessions | ||
| Evangelos Kanoulas, Ben Carterette, Paul D. Clough, Mark Sanderson | ||
| Pages: 1053-1062 | ||
| doi>10.1145/2009916.2010056 | ||
The standard system-based evaluation paradigm has focused on assessing the performance of retrieval systems in serving the best results for a single query. Real users, however, often begin an interaction with a search engine with a sufficiently under-specified ...
|
||
| Quantifying test collection quality based on the consistency of relevance judgements | ||
| Falk Scholer, Andrew Turpin, Mark Sanderson | ||
| Pages: 1063-1072 | ||
| doi>10.1145/2009916.2010057 | ||
Relevance assessments are a key component for test collection-based evaluation of information retrieval systems. This paper reports on a feature of such collections that is used as a form of ground truth data to allow analysis of human assessment error. ...
|
||
| Pseudo test collections for learning web search ranking functions | ||
| Nima Asadi, Donald Metzler, Tamer Elsayed, Jimmy Lin | ||
| Pages: 1073-1082 | ||
| doi>10.1145/2009916.2010058 | ||
Test collections are the primary drivers of progress in information retrieval. They provide yardsticks for assessing the effectiveness of ranking functions in an automatic, rapid, and repeatable fashion and serve as training data for learning to rank ...
|
||
| POSTER SESSION: Posters presentations | ||
| Parallel learning to rank for information retrieval | ||
| Shuaiqiang Wang, Byron J. Gao, Ke Wang, Hady W. Lauw | ||
| Pages: 1083-1084 | ||
| doi>10.1145/2009916.2010060 | ||
Learning to rank represents a category of effective ranking methods for information retrieval. While the primary concern of existing research has been accuracy, learning efficiency is becoming an important issue due to the unprecedented availability ...
|
||
| Learning features through feedback for blog distillation | ||
| Dehong Gao, Renxian Zhang, Wenjie Li, Yiu Keung Lau, Kam Fai Wong | ||
| Pages: 1085-1086 | ||
| doi>10.1145/2009916.2010061 | ||
The paper is focused on blogosphere research based on the TREC blog distillation task, and aims to explore unbiased and significant features automatically and efficiently. Feedback from faceted feeds is introduced to harvest relevant features and information ...
|
||
| Time-based relevance models | ||
| Mostafa Keikha, Shima Gerani, Fabio Crestani | ||
| Pages: 1087-1088 | ||
| doi>10.1145/2009916.2010062 | ||
This paper addresses blog feed retrieval where the goal is to retrieve the most relevant blog feeds for a given user query. Since the retrieval unit is a blog, as a collection of posts, performing relevance feedback techniques and selecting the most ...
|
||
| Improved query performance prediction using standard deviation | ||
| Ronan Cummins, Joemon Jose, Colm O'Riordan | ||
| Pages: 1089-1090 | ||
| doi>10.1145/2009916.2010063 | ||
Query performance prediction (QPP) is an important task in information retrieval (IR). In this paper, we (1) develop a new predictor based on the standard deviation of scores in a variable length ranked list, and (2) we show that this new predictor outperforms ...
|
||
| Learning to rank using query-level regression | ||
| Jiajin Wu, Zhihao Yang, Yuan Lin, Hongfei Lin, Zheng Ye, Kan Xu | ||
| Pages: 1091-1092 | ||
| doi>10.1145/2009916.2010064 | ||
In this paper, we use query-level regression as the loss function. The regression loss function has been used in pointwise methods, however pointwise methods ignore the query boundaries and treat the data equally across queries, and thus the effectiveness ...
|
||
| Diversifying product search results | ||
| Xiangru Chen, Haofen Wang, Xinruo Sun, Junfeng Pan, Yong Yu | ||
| Pages: 1093-1094 | ||
| doi>10.1145/2009916.2010065 | ||
In recent years, online shopping is becoming more and more popular. Users type keyword queries on product search systems to find relevant products, accessories, and even related products. However, existing product search systems always return very similar ...
|
||
| Ad hoc IR: not much room for improvement | ||
| Andrew Trotman, David Keeler | ||
| Pages: 1095-1096 | ||
| doi>10.1145/2009916.2010066 | ||
Ranking function performance reached a plateau in 1994. The reason for this is investigated. First the performance of BM25 is measured as the proportion of queries satisfied on the first page of 10 results -- it performs well. The performance is then ...
|
||
| Image annotation based on recommendation model | ||
| Zijia Lin, Guiguang Ding, Jianmin Wang | ||
| Pages: 1097-1098 | ||
| doi>10.1145/2009916.2010067 | ||
In this paper, a novel approach based on a recommendation model is proposed for automatic image annotation. For any to-be-annotated image, we first select some related images with tags from the training dataset according to their visual similarity. And then ...
|
||
| Utilizing minimal relevance feedback for ad hoc retrieval | ||
| Eyal Krikon, Oren Kurland | ||
| Pages: 1099-1100 | ||
| doi>10.1145/2009916.2010068 | ||
Using relevance feedback can significantly improve (ad hoc) retrieval effectiveness. Yet, if little feedback is available, effectively exploiting it is a challenge. To that end, we present a novel approach that utilizes document passages. Empirical evaluation ...
|
||
| Sense discrimination for physics retrieval | ||
| Christina Lioma, Alok Kothari, Hinrich Schuetze | ||
| Pages: 1101-1102 | ||
| doi>10.1145/2009916.2010069 | ||
Information Retrieval in technical domains like physics is characterised by long and precise queries, whose meaning is strongly influenced by term context and domain. We treat this as a disambiguation problem, and present initial findings of a retrieval ...
|
||
| When documents are very long, BM25 fails! | ||
| Yuanhua Lv, ChengXiang Zhai | ||
| Pages: 1103-1104 | ||
| doi>10.1145/2009916.2010070 | ||
We reveal that the Okapi BM25 retrieval function tends to overly penalize very long documents. To address this problem, we present a simple yet effective extension of BM25, namely BM25L, which "shifts" the term frequency normalization formula to boost ...
|
||
| Location and timeliness of information sources during news events | ||
| Elad Yom-Tov, Fernando Diaz | ||
| Pages: 1105-1106 | ||
| doi>10.1145/2009916.2010071 | ||
People nowadays can obtain information on current news events through media outlets, social media, and by actively seeking information using search engines. In this paper we investigate the temporal relationship between news coverage by media outlets, ...
|
||
| What deliberately degrading search quality tells us about discount functions | ||
| Paul Thomas, Timothy Jones, David Hawking | ||
| Pages: 1107-1108 | ||
| doi>10.1145/2009916.2010072 | ||
Deliberate degradation of search results is a common tool in user experiments. We degrade high-quality search results by inserting non-relevant documents at different ranks. The effect of these manipulations, on a number of commonly-used metrics, is ...
|
||
| Collective topic modeling for heterogeneous networks | ||
| Hongbo Deng, Bo Zhao, Jiawei Han | ||
| Pages: 1109-1110 | ||
| doi>10.1145/2009916.2010073 | ||
In this paper, we propose a joint probabilistic topic model for simultaneously modeling the contents of multi-typed objects of a heterogeneous information network. The intuition behind our model is that different objects of the heterogeneous network ...
|
||
| Graph-cut based tag enrichment | ||
| Xueming Qian, Xian-Sheng Hua | ||
| Pages: 1111-1112 | ||
| doi>10.1145/2009916.2010074 | ||
In this paper, a graph cut based tag enrichment approach is proposed. We build a graph for each image with its initial tags. The graph has two terminals. Nodes of the graph are fully connected with each other. A min-cut/max-flow algorithm is utilized ...
|
||
| Personalized social query expansion using social bookmarking systems | ||
| Mohamed Reda Bouadjenek, Hakim Hacid, Mokrane Bouzeghoub, Johann Daigremont | ||
| Pages: 1113-1114 | ||
| doi>10.1145/2009916.2010075 | ||
We propose a new approach for social and personalized query expansion using social structures in the Web 2.0. While focusing on social tagging systems, the proposed approach considers (i) the semantic similarity between tags composing a query, (ii) a ...
|
||
| What are the real differences of children's and adults' web search | ||
| Tatiana Gossen, Thomas Low, Andreas Nürnberger | ||
| Pages: 1115-1116 | ||
| doi>10.1145/2009916.2010076 | ||
We present first results of a logfile analysis on web search engines for children. The aim of this research is to analyse fundamental facts about how children's web search behaviour differs from that of adults. We show differences to previous results, ...
|
||
| Cognitive coordinating behaviors in multitasking web search | ||
| Jia Tina Du | ||
| Pages: 1117-1118 | ||
| doi>10.1145/2009916.2010077 | ||
This paper investigates how users cognitively coordinate multitasking Web search across different information search problems. The analysis suggests that (1) multitasking is a prevalent Web search behavior including both sequential multitasking (31%) ...
|
||
| Optimizing multimodal reranking for web image search | ||
| Hao Li, Meng Wang, Zhisheng Li, Zheng-Jun Zha, Jialie Shen | ||
| Pages: 1119-1120 | ||
| doi>10.1145/2009916.2010078 | ||
In this poster, we introduce a web image search reranking approach with exploring multiple modalities. Different from the conventional methods that build graph with one feature set for reranking, our approach integrates multiple feature sets that describe ...
|
||
| Multi-layer graph-based semi-supervised learning for large-scale image datasets using mapreduce | ||
| Wen-Yu Lee, Liang-Chi Hsieh, Guan-Long Wu, Winston Hsu, Ya-Fan Su | ||
| Pages: 1121-1122 | ||
| doi>10.1145/2009916.2010079 | ||
Semi-supervised learning aims to exploit the vast amount of unlabeled data in the world. This paper proposes a scalable graph-based technique leveraging the distributed computing power of the MapReduce programming model. For a higher quality of learning, ...
|
||
| Tackling class imbalance and data scarcity in literature-based gene function annotation | ||
| Mathieu Blondel, Kazuhiro Seki, Kuniaki Uehara | ||
| Pages: 1123-1124 | ||
| doi>10.1145/2009916.2010080 | ||
In recent years, a number of machine learning approaches to literature-based gene function annotation have been proposed. However, due to issues such as lack of labeled data, class imbalance and computational cost, they have usually been unable to surpass ...
|
||
| Bootstrapping subjectivity detection | ||
| Valentin Jijkoun, Maarten de Rijke | ||
| Pages: 1125-1126 | ||
| doi>10.1145/2009916.2010081 | ||
We describe a method for automatically generating subjectivity clues for a specific topic and a set of (relevant) documents, evaluating it on the task of classifying sentences w.r.t. subjectivity, with improvements over previous work.
|
||
| The effects of choice in routing relevance judgments | ||
| Edith Law, Paul N. Bennett, Eric Horvitz | ||
| Pages: 1127-1128 | ||
| doi>10.1145/2009916.2010082 | ||
The emergence of human computation systems, including Mechanical Turk and games with a purpose, has made it feasible to distribute relevance judgment tasks to workers over the Web. Most human computation systems assign tasks to individuals randomly, ...
|
||
| Statistical feature extraction for cross-language web content quality assessment | ||
| Guang-Gang Geng, Xiao-Dong Li, Li-Ming Wang, Wei Wang, Shuo Shen | ||
| Pages: 1129-1130 | ||
| doi>10.1145/2009916.2010083 | ||
Web content quality assessment is a typical static ranking problem. Heuristic content and TFIDF features based statistical systems have proven effective for Web content quality assessment. But they are all language dependent features, which are not suitable ...
|
||
| Exploiting endorsement information and social influence for item recommendation | ||
| Cheng-Te Li, Shou-De Lin, Man-Kwan Shan | ||
| Pages: 1131-1132 | ||
| doi>10.1145/2009916.2010084 | ||
Social networking services possess two features: (1) capturing the social relationships among people, represented by the social network, and (2) allowing users to express their preferences on different kinds of items (e.g. photo, celebrity, pages) through ...
|
||
| Modeling subset distributions for verbose queries | ||
| Xiaobing Xue, W. Bruce Croft | ||
| Pages: 1133-1134 | ||
| doi>10.1145/2009916.2010085 | ||
Improving verbose (or long) queries poses a new challenge for search systems. Previous techniques mainly focused on two aspects, weighting the important words or phrases and selecting the best subset query. The former does not consider how words and ...
|
||
| Domain expert topic familiarity and search behavior | ||
| Sarvnaz Karimi, Falk Scholer, Adam Clark, Sadegh Kharazmi | ||
| Pages: 1135-1136 | ||
| doi>10.1145/2009916.2010086 | ||
Users of information retrieval systems employ a variety of strategies when searching for information. One factor that can directly influence how searchers go about their information finding task is the level of familiarity with a search topic. We investigate ...
|
||
| Sample selection for dictionary-based corpus compression | ||
| Christopher Hoobin, Simon Puglisi, Justin Zobel | ||
| Pages: 1137-1138 | ||
| doi>10.1145/2009916.2010087 | ||
Compression of large text corpora has the potential to drastically reduce both storage requirements and per-document access costs. Adaptive methods used for general-purpose compression are ineffective for this application, and historically the most successful ...
|
||
| Evaluating medical information retrieval | ||
| Bevan Koopman, Peter Bruza, Laurianne Sitbon, Michael Lawley | ||
| Pages: 1139-1140 | ||
| doi>10.1145/2009916.2010088 | ||
|
||
|
This paper presents a framework for evaluating information retrieval of medical records. We use the BLULab corpus, a large collection of real-world de-identified medical records. The collection has been hand coded by clinical terminologists using the ...
|
||
| Region-based landmark discovery by crowdsourcing geo-referenced photos | ||
| Yen-Ta Huang, An-Jung Cheng, Liang-Chi Hsieh, Winston Hsu, Kuo-Wei Chang | ||
| Pages: 1141-1142 | ||
| doi>10.1145/2009916.2010089 | ||
|
||
|
We propose a novel model for landmark discovery that locates region-based landmarks on a map in contrast to the traditional point-based landmarks. The proposed method preserves more information and automatically identifies candidate regions on a map by crowdsourcing ...
|
||
| Towards effective short text deep classification | ||
| Xinruo Sun, Haofen Wang, Yong Yu | ||
| Pages: 1143-1144 | ||
| doi>10.1145/2009916.2010090 | ||
|
||
|
Recently, more and more short texts (e.g., ads, tweets) appear on the Web. Classifying short texts into a large taxonomy like ODP or Wikipedia category system has become an important mining task to improve the performance of many applications such as ...
|
||
| Temporal latent semantic analysis for collaboratively generated content: preliminary results | ||
| Yu Wang, Eugene Agichtein | ||
| Pages: 1145-1146 | ||
| doi>10.1145/2009916.2010091 | ||
|
||
|
Latent semantic analysis (LSA) has been intensively studied because of its wide application to Information Retrieval and Natural Language Processing. Yet, traditional models such as LSA only examine one (current) version of the document. However, due ...
|
||
| Self-adjusting hybrid recommenders based on social network analysis | ||
| Alejandro Bellogin, Pablo Castells, Ivan Cantador | ||
| Pages: 1147-1148 | ||
| doi>10.1145/2009916.2010092 | ||
|
||
|
Ensemble recommender systems successfully enhance recommendation accuracy by exploiting different sources of user preferences, such as ratings and social contacts. In linear ensembles, the optimal weight of each recommender strategy is commonly tuned ...
|
||
| BlogCast effect on information diffusion in a blogosphere | ||
| Sang-Wook Kim, Christos Faloutsos, Jiwoon Ha | ||
| Pages: 1149-1150 | ||
| doi>10.1145/2009916.2010093 | ||
|
||
|
A blog service company provides a function named BlogCast that exposes quality posts on the blog main page to vitalize a blogosphere. This paper analyzes a new type of information diffusion via BlogCast. We show that there exists a strong halo effect ...
|
||
| Product comparison using comparative relations | ||
| Si Li, Zheng-Jun Zha, Zhaoyan Ming, Meng Wang, Tat-Seng Chua, Jun Guo, Weiran Xu | ||
| Pages: 1151-1152 | ||
| doi>10.1145/2009916.2010094 | ||
|
||
|
This paper proposes a novel Product Comparison approach. The comparative relations between products are first mined from both user reviews on multiple review websites and community-based question answering pairs containing product comparison information. ...
|
||
| Collaborative cyberporn filtering with collective intelligence | ||
| Lung-Hao Lee, Hsin-Hsi Chen | ||
| Pages: 1153-1154 | ||
| doi>10.1145/2009916.2010095 | ||
|
||
|
This paper presents a user intent method to generate blacklists for collaborative cyberporn filtering. A novel porn detection framework that finds new pornographic web pages by mining user search behaviors is proposed. It employs users' clicks in search ...
|
||
| Do IR models satisfy the TDC retrieval constraint | ||
| Stéphane Clinchant, Eric Gaussier | ||
| Pages: 1155-1156 | ||
| doi>10.1145/2009916.2010096 | ||
|
||
| On diversifying and personalizing web search | ||
| David Vallet, Pablo Castells | ||
| Pages: 1157-1158 | ||
| doi>10.1145/2009916.2010097 | ||
|
||
|
Diversification and personalization methods are common approaches to deal with the one-size-fits-all paradigm of Web search engines. We performed a user study with 190 subjects where we analyzed the effects of diversification and personalization methods ...
|
||
| Semantic tag recommendation using concept model | ||
| Chenliang Li, Anwitaman Datta, Aixin Sun | ||
| Pages: 1159-1160 | ||
| doi>10.1145/2009916.2010098 | ||
|
||
|
The common tags given by multiple users to a particular document are often semantically relevant to the document and each tag represents a specific topic. In this paper, we attempt to emulate human tagging behavior to recommend tags by considering the ...
|
||
| Recommending interesting activity-related local entities | ||
| Jie Tang, Ryen W. White, Peter Bailey | ||
| Pages: 1161-1162 | ||
| doi>10.1145/2009916.2010099 | ||
|
||
|
When searching for entities with a strong local character (e.g., a museum), people may also be interested in discovering proximal activity-related entities (e.g., a café). Geographical proximity is a necessary, but not sufficient, qualifier for ...
|
||
| Cross-corpus relevance projection | ||
| Nima Asadi, Donald Metzler, Jimmy Lin | ||
| Pages: 1163-1164 | ||
| doi>10.1145/2009916.2010100 | ||
|
||
| Location disambiguation for geo-tagged images | ||
| Zhu Zhu, Lidan Shou, Kuang Mao, Gang Chen | ||
| Pages: 1165-1166 | ||
| doi>10.1145/2009916.2010101 | ||
|
||
|
In this poster, we address the problem of location disambiguation for geotagged Web photo resources. We propose an approach for analyzing and partitioning large geotagged photo collections using geographic and semantic information. By organizing the ...
|
||
| Towards an indexing method to speed-up music retrieval | ||
| Benjamin Martin, Pierre Hanna, Matthias Robine, Pascal Ferraro | ||
| Pages: 1167-1168 | ||
| doi>10.1145/2009916.2010102 | ||
|
||
|
Computations in most music retrieval systems strongly depend on the size of data compared. We propose to enhance performances of a music retrieval system, namely a harmonic similarity evaluation method, by first indexing relevant parts of music pieces. ...
|
||
| An investigation of decompounding for cross-language patent search | ||
| Johannes Leveling, Walid Magdy, Gareth J.F. Jones | ||
| Pages: 1169-1170 | ||
| doi>10.1145/2009916.2010103 | ||
|
||
|
Decompounding has been found to improve information retrieval (IR) effectiveness in general domains for languages such as German or Dutch. We investigate if cross-language patent retrieval can profit from decompounding. This poses two challenges: i) ...
|
||
| Detecting seasonal queries by time-series analysis | ||
| Milad Shokouhi | ||
| Pages: 1171-1172 | ||
| doi>10.1145/2009916.2010104 | ||
|
||
|
Seasonal events such as Halloween and Christmas repeat every year and initiate several temporal information needs. The impact of such events on users is often reflected in search logs in form of seasonal spikes in the frequency of related queries (e.g. ...
|
||
| Learning to rank under tight budget constraints | ||
| Christian Pölitz, Ralf Schenkel | ||
| Pages: 1173-1174 | ||
| doi>10.1145/2009916.2010105 | ||
|
||
|
This paper investigates the influence of pruning feature lists to keep a given budget for the evaluation of ranking methods. We learn from a given training set how important the individual prefixes are for the ranking quality. Based on their importance ...
|
||
| A novel hybrid index structure for efficient text retrieval | ||
| Andreas Broschart, Ralf Schenkel | ||
| Pages: 1175-1176 | ||
| doi>10.1145/2009916.2010106 | ||
|
||
|
Query processing with precomputed term pair lists can improve efficiency for some queries, but suffers from the quadratic number of index lists that need to be read. We present a novel hybrid index structure that aims at decreasing the number of index ...
|
||
| A weighted curve fitting method for result merging in federated search | ||
| Chuan He, Dzung Hong, Luo Si | ||
| Pages: 1177-1178 | ||
| doi>10.1145/2009916.2010107 | ||
|
||
|
Result merging is an important step in federated search to merge the documents returned from multiple source-specific ranked lists for a user query. Previous result merging methods such as Semi-Supervised Learning (SSL) and Sample-Agglomerate Fitting ...
|
||
| Effect of different docid orderings on dynamic pruning retrieval strategies | ||
| Nicola Tonellotto, Craig Macdonald, Iadh Ounis | ||
| Pages: 1179-1180 | ||
| doi>10.1145/2009916.2010108 | ||
|
||
|
Document-at-a-time (DAAT) dynamic pruning strategies for information retrieval systems such as MaxScore and Wand can increase querying efficiency without decreasing effectiveness. Both work on posting lists sorted by ascending document identifier (docid). ...
|
||
| Time-based query performance predictors | ||
| Nattiya Kanhabua, Kjetil Nørvåg | ||
| Pages: 1181-1182 | ||
| doi>10.1145/2009916.2010109 | ||
|
||
|
Query performance prediction is aimed at predicting the retrieval effectiveness that a query will achieve with respect to a particular ranking model. In this paper, we study query performance prediction for a ranking model that explicitly incorporates ...
|
||
| Search task difficulty: the expected vs. the reflected | ||
| Jingjing Liu, Nicholas J. Belkin | ||
| Pages: 1183-1184 | ||
| doi>10.1145/2009916.2010110 | ||
|
||
|
We report findings on how the user's perception of task difficulty changes before and after searching for information to solve tasks. We found that while in one type of task, the dependent task, this did not change, in another, the parallel task, it ...
|
||
| On the suitability of diversity metrics for learning-to-rank for diversity | ||
| Rodrygo L.T. Santos, Craig Macdonald, Iadh Ounis | ||
| Pages: 1185-1186 | ||
| doi>10.1145/2009916.2010111 | ||
|
||
|
An optimally diverse ranking should achieve the maximum coverage of the aspects underlying an ambiguous or underspecified query, with minimum redundancy with respect to the covered aspects. Although evaluation metrics that reward coverage and penalise ...
|
||
| How diverse are web search results? | ||
| Rodrygo L.T. Santos, Craig Macdonald, Iadh Ounis | ||
| Pages: 1187-1188 | ||
| doi>10.1145/2009916.2010112 | ||
|
||
|
Search result diversification has recently gained attention as a means to tackle ambiguous queries. While query ambiguity is of particular concern for the short queries commonly observed in a Web search scenario, it is unclear how much diversity is actually ...
|
||
| Analysis of an expert search query log | ||
| Yi Fang, Naveen Somasundaram, Luo Si, Jeongwoo Ko, Aditya P. Mathur | ||
| Pages: 1189-1190 | ||
| doi>10.1145/2009916.2010113 | ||
|
||
|
Expert search has made rapid progress in modeling, algorithms and evaluations in recent years. However, there is very little work on analyzing how users interact with expert search systems. In this paper, we conduct analysis of an expert search query ...
|
||
| A model for expert finding in social networks | ||
| Elena Smirnova | ||
| Pages: 1191-1192 | ||
| doi>10.1145/2009916.2010114 | ||
|
||
|
Expert finding is a task of finding knowledgeable people on a given topic. State-of-the-art expertise retrieval algorithms identify matching experts based on analysis of textual content of documents experts are associated with. While powerful, these ...
|
||
| Transductive learning over automatically detected themes for multi-document summarization | ||
| Massih-Reza Amini, Nicolas Usunier | ||
| Pages: 1193-1194 | ||
| doi>10.1145/2009916.2010115 | ||
|
||
|
We propose a new method for query-biased multi-document summarization, based on sentence extraction. The summary of multiple documents is created in two steps. Sentences are first clustered, where each cluster corresponds to one of the main themes present ...
|
||
| Rating-based collaborative filtering combined with additional regularization | ||
| Shu Wu, Shengrui Wang | ||
| Pages: 1195-1196 | ||
| doi>10.1145/2009916.2010116 | ||
|
||
|
The collaborative filtering (CF) approach to recommender system has received much attention recently. However, previous work mainly focuses on improving the formula of rating prediction, e.g. by adding user and item biases, implicit feedback and time-aware ...
|
||
| Words-of-interest selection based on temporal motion coherence for video retrieval | ||
| Lei Wang, Dawei Song, Eyad Elyan | ||
| Pages: 1197-1198 | ||
| doi>10.1145/2009916.2010117 | ||
|
||
|
The "Bag of Visual Words" (BoW) framework has been widely used in query-by-example video retrieval to model the visual content by a set of quantized local feature descriptors. In this paper, we propose a novel technique to enhance BoW by the selection ...
|
||
| Aggregating multiple opinion evidence in proximity-based opinion retrieval | ||
| Shima Gerani, Mostafa Keikha, Fabio Crestani | ||
| Pages: 1199-1200 | ||
| doi>10.1145/2009916.2010118 | ||
|
||
|
Blog post opinion retrieval is the problem of ranking blog posts according to the likelihood that the post is relevant to the query and that the author was expressing an opinion about the topic (of the query). A recent study has proposed a method for ...
|
||
| Enhancing mobile search using web search log data | ||
| Yoshiyuki Inagaki, Jiang Bian, Yi Chang, Motoko Maki | ||
| Pages: 1201-1202 | ||
| doi>10.1145/2009916.2010119 | ||
|
||
|
Mobile search is still in its infancy compared with general purpose web search. With limited training data and weak relevance features, the ranking performance in mobile search is far from satisfactory. To address this problem, we propose to leverage the ...
|
||
| Award prediction with temporal citation network analysis | ||
| Zaihan Yang, Dawei Yin, Brian D. Davison | ||
| Pages: 1203-1204 | ||
| doi>10.1145/2009916.2010120 | ||
|
||
|
Each year many ACM SIG communities will recognize an outstanding researcher through an award in honor of his or her profound impact and numerous research contributions. This work is the first to investigate an automated mechanism to help in selecting ...
|
||
| Rating prediction using feature words extracted from customer reviews | ||
| Masanao Ochi, Makoto Okabe, Rikio Onai | ||
| Pages: 1205-1206 | ||
| doi>10.1145/2009916.2010121 | ||
|
||
|
We developed a simple method of improving the accuracy of rating prediction using feature words extracted from customer reviews. Many rating predictors work well for a small and dense dataset of customer reviews. However, a practical dataset tends to ...
|
||
| Ranking tags in resource collections | ||
| Dimitrios Skoutas, Mohammad Alrifai | ||
| Pages: 1207-1208 | ||
| doi>10.1145/2009916.2010122 | ||
|
||
|
We examine different tag ranking strategies for constructing tag clouds to represent collections of tagged objects. The proposed methods are based on random walk on graphs, diversification, and rank aggregation, and they are empirically evaluated on ...
|
||
| Identifying similar people in professional social networks with discriminative probabilistic models | ||
| Suleyman Cetintas, Monica Rogati, Luo Si, Yi Fang | ||
| Pages: 1209-1210 | ||
| doi>10.1145/2009916.2010123 | ||
|
||
|
Identifying similar professionals is an important task for many core services in professional social networks. Information about users can be obtained from heterogeneous information sources, and different sources provide different insights on user similarity. ...
|
||
| Intent-oriented diversity in recommender systems | ||
| Saul Vargas, Pablo Castells, David Vallet | ||
| Pages: 1211-1212 | ||
| doi>10.1145/2009916.2010124 | ||
|
||
|
Diversity as a relevant dimension of retrieval quality is receiving increasing attention in the Information Retrieval and Recommender Systems (RS) fields. The problem has nonetheless been approached under different views and formulations in IR and RS ...
|
||
| Disambiguating biomedical acronyms using EMIM | ||
| Nut Limsopatham, Rodrygo L.T. Santos, Craig Macdonald, Iadh Ounis | ||
| Pages: 1213-1214 | ||
| doi>10.1145/2009916.2010125 | ||
|
||
|
Expanding a query with acronyms or their corresponding 'long-forms' has not been shown to provide consistent improvements in the biomedical IR literature. The major open issue with expanding acronyms in a query is their inherent ambiguity, as an acronym ...
|
||
| Best document selection based on approximate utility optimization | ||
| Hungyu Henry Lin, Yi Zhang, James Davis | ||
| Pages: 1215-1216 | ||
| doi>10.1145/2009916.2010126 | ||
|
||
|
This poster describes an alternative approach to handling the best document selection problem. Best document selection is a common problem with many real world applications, but is not a well studied problem by itself; a simple solution would be to treat ...
|
||
| Forecasting counts of user visits for online display advertising with probabilistic latent class models | ||
| Suleyman Cetintas, Datong Chen, Luo Si, Bin Shen, Zhanibek Datbayev | ||
| Pages: 1217-1218 | ||
| doi>10.1145/2009916.2010127 | ||
|
||
|
Display advertising is a multi-billion dollar industry where advertisers promote their products to users by having publishers display their advertisements on popular Web pages. An important problem in online advertising is how to forecast the number ...
|
||
| Knowledge effects on document selection in search results pages | ||
| Michael J. Cole, Xiangmin Zhang, Chang Liu, Nicholas J. Belkin, Jacek Gwizdka | ||
| Pages: 1219-1220 | ||
| doi>10.1145/2009916.2010128 | ||
|
||
|
Click through events in search results pages (SERPs) are not reliable implicit indicators of document relevance. A user's task and domain knowledge are key factors in recognition and link selection and the most useful SERP document links may be those ...
|
||
| Learning to rank from a noisy crowd | ||
| Abhimanu Kumar, Matthew Lease | ||
| Pages: 1221-1222 | ||
| doi>10.1145/2009916.2010129 | ||
|
||
|
We study how to best use crowdsourced relevance judgments for learning to rank [1, 7]. We integrate two lines of prior work: unreliable crowd-based binary annotation for binary classification [5, 3], and aggregating graded relevance judgments from reliable ...
|
||
| How to count thumb-ups and thumb-downs?: an information retrieval approach to user-rating based ranking of items | ||
| Dell Zhang, Robert Mao, Haitao Li, Joanne Mao | ||
| Pages: 1223-1224 | ||
| doi>10.1145/2009916.2010130 | ||
|
||
|
It is a common practice among Web 2.0 services to allow users to rate items on their sites. In this paper, we first point out the flaws of the popular methods for user-rating based ranking of items, and then argue that two well-known Information Retrieval ...
|
||
| Predicting users' domain knowledge from search behaviors | ||
| Xiangmin Zhang, Michael Cole, Nicholas Belkin | ||
| Pages: 1225-1226 | ||
| doi>10.1145/2009916.2010131 | ||
|
||
|
This study uses regression modeling to predict a user's domain knowledge level (DK) from implicit evidence provided by certain search behaviors. A user study (n=35) with recall-oriented search tasks in the genomic domain was conducted. A number of regression ...
|
||
| The interactive PRP for diversifying document rankings | ||
| Guido Zuccon, Leif Azzopardi, C.J. "Keith" van Rijsbergen | ||
| Pages: 1227-1228 | ||
| doi>10.1145/2009916.2010132 | ||
|
||
|
The assumptions underlying the Probability Ranking Principle (PRP) have led to a number of alternative approaches that cater or compensate for the PRP's limitations. In this poster we focus on the Interactive PRP (iPRP), which rejects the assumption ...
|
||
| Detecting success in mobile search from interaction | ||
| Qi Guo, Shuai Yuan, Eugene Agichtein | ||
| Pages: 1229-1230 | ||
| doi>10.1145/2009916.2010133 | ||
|
||
|
Predicting searcher success and satisfaction is a key problem in Web search, which is essential for automatically evaluating and improving search engine performance. This problem has been studied actively in the desktop search setting, but not specifically ...
|
||
| Measuring assessor accuracy: a comparison of NIST assessors and user study participants | ||
| Mark D. Smucker, Chandra Prakash Jethani | ||
| Pages: 1231-1232 | ||
| doi>10.1145/2009916.2010134 | ||
|
||
|
In many situations, humans judging document relevance are forced to trade-off accuracy for speed. The development of better interactive retrieval systems and relevance assessing platforms requires the measurement of assessor accuracy, but to date the ...
|
||
| A bipartite graph based social network splicing method for person name disambiguation | ||
| Jintao Tang, Qin Lu, Ting Wang, Ji Wang, Wenjie Li | ||
| Pages: 1233-1234 | ||
| doi>10.1145/2009916.2010135 | ||
|
||
|
The key issue of person name disambiguation is to discover different namesakes in massive web documents rather than simply cluster documents by using textual features. In this paper, we describe a novel person name disambiguation method based on social ...
|
||
| Link formation analysis in microblogs | ||
| Dawei Yin, Liangjie Hong, Xiong Xiong, Brian D. Davison | ||
| Pages: 1235-1236 | ||
| doi>10.1145/2009916.2010136 | ||
|
||
|
Unlike a traditional social network service, a microblogging network like Twitter is a hybrid network, combining aspects of both social networks and information networks. Understanding the structure of such hybrid networks and predicting new links are ...
|
||
| Evolution of web search results within years | ||
| Ismail Sengor Altingovde, Rifat Ozcan, Özgür Ulusoy | ||
| Pages: 1237-1238 | ||
| doi>10.1145/2009916.2010137 | ||
|
||
|
We provide a first large-scale analysis of the evolution of query results obtained from a real search engine at two distant points in time, namely, in 2007 and 2010, for a set of 630,000 real queries.
|
||
| Decayed DivRank: capturing relevance, diversity and prestige in information networks | ||
| Pan Du, Jiafeng Guo, Xue-Qi Cheng | ||
| Pages: 1239-1240 | ||
| doi>10.1145/2009916.2010138 | ||
|
||
|
Many network-based ranking approaches have been proposed to rank objects according to different criteria, including relevance, prestige and diversity. However, existing approaches either only aim at one or two of the criteria, or handle them with additional ...
|
||
| Multi-objective optimization in learning to rank | ||
| Na Dai, Milad Shokouhi, Brian D. Davison | ||
| Pages: 1241-1242 | ||
| doi>10.1145/2009916.2010139 | ||
|
||
|
Supervised learning to rank algorithms typically optimize for high relevance and ignore other facets of search quality, such as freshness and diversity. Prior work on multi-objective ranking trained rankers focused on using hybrid labels that combine ...
|
||
| A large-scale study of the effect of training set characteristics over learning-to-rank algorithms | ||
| Evangelos Kanoulas, Stefan Savev, Pavel Metrikov, Virgil Pavlu, Javed Aslam | ||
| Pages: 1243-1244 | ||
| doi>10.1145/2009916.2010140 | ||
|
||
|
In this work we describe the results of a large-scale study on the effect of the distribution of labels across the different grades of relevance in the training set on the performance of trained ranking functions. In a controlled experiment we generate ...
|
||
| Exploring term temporality for pseudo-relevance feedback | ||
| Stewart Whiting, Yashar Moshfeghi, Joemon M. Jose | ||
| Pages: 1245-1246 | ||
| doi>10.1145/2009916.2010141 | ||
|
||
|
As digital collections expand, the importance of the temporal aspect of information has become increasingly apparent. The aim of this paper is to investigate the effect of using long-term temporal profiles of terms in information retrieval by enhancing ...
|
||
| MSSF: a multi-document summarization framework based on submodularity | ||
| Jingxuan Li, Lei Li, Tao Li | ||
| Pages: 1247-1248 | ||
| doi>10.1145/2009916.2010142 | ||
|
||
|
Multi-document summarization aims to distill the most representative information from a set of documents to generate a summary. Given a set of documents as input, most existing multi-document summarization approaches utilize different sentence selection ...
|
||
| SEJoin: an optimized algorithm towards efficient approximate string searches | ||
| Junfeng Zhou, Ziyang Chen, Jingrong Zhang | ||
| Pages: 1249-1250 | ||
| doi>10.1145/2009916.2010143 | ||
|
||
|
We investigated the problem of finding from a collection of strings those similar to a given query string based on edit distance, for which the critical operation is merging inverted lists of grams generated from the collection of strings. We present ...
|
||
| Bag-of-visual-words vs global image descriptors on two-stage multimodal retrieval | ||
| Konstantinos Zagoris, Savvas A. Chatzichristofis, Avi Arampatzis | ||
| Pages: 1251-1252 | ||
| doi>10.1145/2009916.2010144 | ||
|
||
|
The Bag-Of-Visual-Words (BOVW) paradigm is fast becoming a popular image representation for Content-Based Image Retrieval (CBIR), mainly because of its better retrieval effectiveness over global feature representations on collections with images being ...
|
||
| Query term ranking based on search results overlap | ||
| Wei Song, Yu Zhang, Yubin Xie, Ting Liu, Sheng Li | ||
| Pages: 1253-1254 | ||
| doi>10.1145/2009916.2010145 | ||
|
||
|
In this paper, we propose a method to rank and assign weights to query terms according to their impact on the topic of the query. We use Search Result Overlap Ratio (SROR) to quantify the overlap of the search results of the full query and a shorten ...
|
||
| Tossing coins to trim long queries | ||
| Sudip Datta, Vasudeva Varma | ||
| Pages: 1255-1256 | ||
| doi>10.1145/2009916.2010146 | ||
|
||
|
Verbose web queries are often descriptive in nature where a term based search engine is unable to distinguish between the essential and noisy words, which can result in a drift from the user intent. We present a randomized query reduction technique that ...
|
||
| A comparison of time-aware ranking methods | ||
| Nattiya Kanhabua, Kjetil Nørvåg | ||
| Pages: 1257-1258 | ||
| doi>10.1145/2009916.2010147 | ||
|
||
|
When searching a temporal document collection, e.g., news archives or blogs, the time dimension must be explicitly incorporated into a retrieval model in order to improve relevance ranking. Previous work has followed one of two main approaches: 1) a ...
|
||
| Learning for graphs with annotated edges | ||
| Fan Li | ||
| Pages: 1259-1260 | ||
| doi>10.1145/2009916.2010148 | ||
|
||
|
Automatic classification with graphs containing annotated edges is an interesting problem and has many potential applications. We present a risk minimization formulation that exploits the annotated edges for classification tasks. One major advantage ...
|
||
| Formulating effective questions for community-based question answering | ||
| Saori Suzuki, Shin'ichi Nakayama, Hideo Joho | ||
| Pages: 1261-1262 | ||
| doi>10.1145/2009916.2010149 | ||
|
||
|
Community-based Question Answering (CQA) services have become a major venue for people's information seeking on the Web. However, many studies on CQA have focused on the prediction of the best answers for a given question. This paper looks into ...
|
||
| DEMONSTRATION SESSION: Demonstrations | ||
| ClusteringWiki: personalized and collaborative clustering of search results | ||
| Dragos C. Anastasiu, Byron J. Gao, David Buttler | ||
| Pages: 1263-1264 | ||
| doi>10.1145/2009916.2010151 | ||
|
||
|
How to organize and present search results plays a critical role in the utility of search engines. Due to the unprecedented scale of the Web and diversity of search results, the common strategy of ranked lists has become increasingly inadequate, and ...
|
||
| OrientSTS: spatio-temporal sequence searching in Flickr | ||
| Chunjie Zhou, Dongqi Liu, Xiaofeng Meng | ||
| Pages: 1265-1266 | ||
| doi>10.1145/2009916.2010152 | ||
|
||
|
Nowadays, due to the increasing user requirements of efficient and personalized services, a perfect travel plan is urgently needed. However, at present it is hard for people to make a personalized traveling plan. Most of them follow other people's general ...
|
||
| A toolkit for knowledge base population | ||
| Zheng Chen, Suzanne Tamang, Adam Lee, Heng Ji | ||
| Pages: 1267-1268 | ||
| doi>10.1145/2009916.2010153 | ||
|
||
|
The main goal of knowledge base population (KBP) is to distill entity information (e.g., facts of a person) from multiple unstructured and semi-structured data sources, and incorporate the information into a knowledge base (KB). In this work, we intend ...
|
||
| iMecho: a context-aware desktop search system | ||
| Jidong Chen, Hang Guo, Wentao Wu, Wei Wang | ||
| Pages: 1269-1270 | ||
| doi>10.1145/2009916.2010154 | ||
|
||
|
In this demo, we present iMecho, a context-aware desktop search system to help users get more relevant results. Different from other desktop search engines, iMecho ranks results not only by the content of the query, but also the context of the query. ...
|
||
| Visualizing and querying semantic social networks | ||
| Aixin Sun, Anwitaman Datta, Ee-Peng Lim, Kuiyu Chang | ||
| Pages: 1271-1272 | ||
| doi>10.1145/2009916.2010155 | ||
|
||
|
We demonstrate SSNetViz, which was developed for integrating, visualizing and querying heterogeneous semantic social networks obtained from multiple information sources. A semantic social network refers to a social network graph with multi-typed nodes and ...
|
||
| What-you-retrieve-is-what-you-see: a preliminary cyber-physical search engine | ||
| Lidan Shou, Ke Chen, Gang Chen, Chao Zhang, Yi Ma, Xian Zhang | ||
| Pages: 1273-1274 | ||
| doi>10.1145/2009916.2010156 | ||
|
||
|
The cyber-physical systems (CPS) are envisioned as a class of real-time systems integrating the computing, communication and storage facilities with monitoring and control of the physical world. One interesting CPS application in the mobile Internet ...
|
||
| QuickView: advanced search of tweets | ||
| Xiaohua Liu, Long Jiang, Furu Wei, Ming Zhou, QuickView Team Microsoft | ||
| Pages: 1275-1276 | ||
| doi>10.1145/2009916.2010157 | ||
|
||
|
Tweets have become a comprehensive repository for real-time information. However, it is often hard for users to quickly get information they are interested in from tweets, owing to the sheer volume of tweets as well as their noisy and informal nature. ...
|
||
| Personalized video: leanback online video consumption | ||
| Krishnan Ramanathan, Yogesh Sankarasubramaniam, Vidhya Govindaraju | ||
| Pages: 1277-1278 | ||
| doi>10.1145/2009916.2010158 | ||
|
||
|
Current user interfaces for online video consumption are mostly browser based, lean forward, require constant interaction and provide a fragmented view of the total content available. For easier consumption, the user interface and interactions need to ...
|
||
| GreenMeter: a tool for assessing the quality and recommending tags for web 2.0 applications | ||
| Saulo M.R. Ricci, Dilson A. Guimarães, Fabiano M. Belém, Jussara M. Almeida, Marcos A. Gonçalves, Raquel Prates | ||
| Pages: 1279-1280 | ||
| doi>10.1145/2009916.2010159 | ||
|
We present GreenMeter, a tool for assessing the quality and recommending tags for Web 2.0 content. Its goal is to improve tag quality and the effectiveness of various information services (e.g., search, content recommendation) that rely on tags as data ...
|
||
| JuSe: a picture dictionary query system for children | ||
| Tamara Polajnar, Richard Glassey, Leif Azzopardi | ||
| Pages: 1281-1282 | ||
| doi>10.1145/2009916.2010160 | ||
|
As adults we take for granted our capacity to express our information needs verbally and textually. However, young children also have preferences and information needs, but are just learning to be able to express themselves effectively. Consequently ...
|
||
| CrowdTracker: enabling community-based real-time web monitoring | ||
| James Caverlee, Zhiyuan Cheng, Brian Eoff, Chiao-Fang Hsu, Krishna Kamath, Jeffrey McGee | ||
| Pages: 1283-1284 | ||
| doi>10.1145/2009916.2010161 | ||
|
CrowdTracker is a community-based web monitoring system optimized for real-time web streams like Twitter, Facebook, and Google Buzz. In this demo summary, we provide an overview of the system and architecture, and outline the demonstration plan.
|
||
| The Meta-Dex Suite: generating and analyzing indexes and meta-indexes | ||
| Michael Huggett, Edie Rasmussen | ||
| Pages: 1285-1286 | ||
| doi>10.1145/2009916.2010162 | ||
|
Our Meta-dex software suite extracts content and index text from a corpus of PDF files, and generates a meta-index that references entries across an entire domain. We provide tools to analyze the individual and integrated indexes, and visualize entries ...
|
||
| Tulsa: web search for writing assistance | ||
| Duo Ding, Xingping Jiang, Matthew R. Scott, Ming Zhou, Yong Yu | ||
| Pages: 1287-1288 | ||
| doi>10.1145/2009916.2010163 | ||
| The TREC files: the (ground) truth is out there | ||
| Savvas A. Chatzichristofis, Konstantinos Zagoris, Avi Arampatzis | ||
| Pages: 1289-1290 | ||
| doi>10.1145/2009916.2010164 | ||
|
Traditional tools for information retrieval (IR) evaluation, such as TREC's trec_eval, have outdated command-line interfaces with many unused features, or 'switches', accumulated over the years. They are usually seen as cumbersome applications by new ...
|
||
| A tool for comparative IR evaluation on component level | ||
| Thomas Wilhelm, Jens Kürsten, Maximilian Eibl | ||
| Pages: 1291-1292 | ||
| doi>10.1145/2009916.2010165 | ||
| TUTORIAL SESSION: Tutorials | ||
| Machine learning for information retrieval | ||
| Luo Si, Rong Jin | ||
| Pages: 1293-1294 | ||
| doi>10.1145/2009916.2010167 | ||
|
In recent years, we have witnessed successful application of machine learning techniques to a wide range of information retrieval problems, including Web search engines, recommendation systems, online advertising, etc. It is thus critical for researchers ...
|
||
| Enhancing web search by mining search and browse logs | ||
| Daxin Jiang, Jian Pei, Hang Li | ||
| Pages: 1295-1296 | ||
| doi>10.1145/2009916.2010168 | ||
|
Huge amounts of search log data have been accumulated in various search engines. Currently, a commercial search engine receives billions of queries and collects tera-bytes of log data on any single day. Other than search log data, browse logs can be ...
|
||
| A new look at old tricks: the fertile roots of current research | ||
| Paul B. Kantor | ||
| Pages: 1297-1298 | ||
| doi>10.1145/2009916.2010169 | ||
|
As we face an explosion of potential new applications for the fundamental concepts and technologies of information retrieval, ranging from ad ranking to social media, from collaborative recommending to question answering systems, many researchers are ...
|
||
| Crowdsourcing for information retrieval: principles, methods, and applications | ||
| Omar Alonso, Matthew Lease | ||
| Pages: 1299-1300 | ||
| doi>10.1145/2009916.2010170 | ||
|
Crowdsourcing has emerged in recent years as a promising new avenue for leveraging today's digitally-connected, diverse, distributed workforce. Generally speaking, crowdsourcing describes outsourcing of tasks to a large group of people instead of assigning ...
|
||
| Practical online retrieval evaluation | ||
| Filip Radlinski, Yisong Yue | ||
| Pages: 1301-1302 | ||
| doi>10.1145/2009916.2010171 | ||
|
Online evaluation is amongst the few evaluation techniques available to the information retrieval community that is guaranteed to reflect how users actually respond to improvements developed by the community. Broadly speaking, online evaluation refers ...
|
||
| Web retrieval: the role of users | ||
| Ricardo Baeza-Yates, Yoelle Maarek | ||
| Pages: 1303-1304 | ||
| doi>10.1145/2009916.2010172 | ||
|
Web retrieval methods have evolved through three major steps in the last decade or so. They started from standard document-centric IR in the early days of the Web, then made a major step forward by leveraging the structure of the Web, using link analysis ...
|
||
| Information organization and retrieval with collaboratively generated content | ||
| Eugene Agichtein, Evgeniy Gabrilovich | ||
| Pages: 1307-1308 | ||
| doi>10.1145/2009916.2010173 | ||
|
Proliferation of ubiquitous access to the Internet enables millions of Web users to collaborate online on a variety of activities. Many of these activities result in the construction of large repositories of knowledge, either as their primary aim (e.g., ...
|
||
| SESSION: Doctoral consortium | ||
| Persistence in the ephemeral: utilizing repeat behaviors for multi-session personalized search | ||
| Sarah K. Tyler | ||
| Pages: 1311-1312 | ||
| doi>10.1145/2009916.2010175 | ||
|
As the abundance of information on the Internet grows, an increasing burden is placed on the user to specify his or her query precisely in order to avoid extraneous results that may be relevant, but not useful. At the same time, users have a tendency ...
|
||
| Search engines that learn online | ||
| Katja Hofmann | ||
| Pages: 1313-1314 | ||
| doi>10.1145/2009916.2010176 | ||
|
The goal of my research is to develop self-learning search engines, that can learn online, i.e., directly from interactions with actual users. Such systems can continuously adapt to user preferences throughout their lifetime, leading to better search ...
|
||
| Query expansion based on a semantic graph model | ||
| Xue Jiang | ||
| Pages: 1315-1316 | ||
| doi>10.1145/2009916.2010177 | ||
|
Query expansion is a classical topic in the field of information retrieval, which is proposed to bridge the gap between searchers' information intents and their queries. Previous researches usually expand queries based on document collections, or some ...
|
||
| Descriptive modelling of text classification and its integration with other IR tasks | ||
| Miguel Martinez-Alvarez | ||
| Pages: 1317-1318 | ||
| doi>10.1145/2009916.2010178 | ||
|
Nowadays, Information Retrieval (IR) systems have to deal with multiple sources of data available in different formats. Datasets can consist of complex and heterogeneous objects with relationships between them. In addition, information needs can vary ...
|
||
| Efficient and effective solutions for search engines | ||
| Xiang-Fei Jia | ||
| Pages: 1319-1320 | ||
| doi>10.1145/2009916.2010179 | ||
| Modeling document scores for distributed information retrieval | ||
| Ilya Markov | ||
| Pages: 1321-1322 | ||
| doi>10.1145/2009916.2010180 | ||
|
Distributed Information Retrieval (DIR), also known as Federated Search, integrates multiple searchable collections and provides direct access to them through a unified interface [3]. This is done by a centralized broker, that receives user queries, ...
|
||
| Improving query and result list adaptation in personalized multilingual information retrieval | ||
| M. Rami Ghorab | ||
| Pages: 1323-1324 | ||
| doi>10.1145/2009916.2010181 | ||
|
A general characteristic of Information Retrieval (IR) and Multilingual IR (MIR) [5] systems is that if the same query was submitted by different users, the system would yield the same results, regardless of the user. On the other hand, Adaptive Hypermedia ...
|
||
| Using k-Top retrieved web snippets to date temporal implicit queries based on web content analysis | ||
| Ricardo Nuno Taborda Campos | ||
| Pages: 1325-1326 | ||
| doi>10.1145/2009916.2010182 | ||
|
The World Wide Web (WWW) is a huge information network from which retrieving and organizing quality relevant content remains an open question for mostly all ambiguous queries. As an example, many queries have temporal implicit intents associated with ...
|
||
| Domain-specific information retrieval using recommenders | ||
| Wei Li | ||
| Pages: 1327-1328 | ||
| doi>10.1145/2009916.2010183 | ||
|
The continuing increase in the volume of information available in our daily lives is creating ever greater challenges for people to find personally useful information. One approach used to addressing this problem is Personalized Information Retrieval ...
|
||
| Understanding and using contextual information in recommender systems | ||
| Licai Wang | ||
| Pages: 1329-1330 | ||
| doi>10.1145/2009916.2010184 | ||
| Multidimensional search result diversification: diverse search results for diverse users | ||
| Sumit Bhatia | ||
| Pages: 1331-1332 | ||
| doi>10.1145/2009916.2010185 | ||
|
Hundreds of millions of people today rely on Web based Search Engines to satisfy their information needs. In order to meet the expectations of this vast and diverse user population, the search engine should present a list of results such that the probability ...
|
||
| SESSION: Industrial track | ||
| Sensor-aided mobile information management and retrieval | ||
| Edward Y. Chang | ||
| Pages: 1333-1334 | ||
| doi>10.1145/2009916.2010187 | ||
|
The number of "smart" mobile devices such as wireless phones and tablet computers has been rapidly growing. These mobile devices are equipped with a variety of sensors such as camera, gyroscope, accelerometer, compass, NFC, WiFi, GPS, etc. These sensors ...
|
||
| Predicting eBay listing conversion | ||
| Ted Tao Yuan, Zhaohui Chen, Mike Mathieson | ||
| Pages: 1335-1336 | ||
| doi>10.1145/2009916.2010188 | ||
|
At eBay Market Place, listing conversion rate can be measured by number of items sold divided by number of items in a sample set. For a given item, conversion rate can also be treated as the probability of sale. By investigating eBay listings' transactional ...
|
||
| A large scale machine learning system for recommending heterogeneous content in social networks | ||
| Yanxin Shi, David Ye, Andrey Goder, Srinivas Narayanan | ||
| Pages: 1337-1338 | ||
| doi>10.1145/2009916.2010189 | ||
|
The goal of the Facebook recommendation engine is to compare and rank heterogeneous types of content in order to find the most relevant recommendations based on user preference and page context. The challenges for such a recommendation engine include ...
|
||
| Review of MSR-Bing web scale speller challenge | ||
| Kuansan Wang, Jan Pedersen | ||
| Pages: 1339-1340 | ||
| doi>10.1145/2009916.2010190 | ||
|
In this paper, we provide an overview of the MSR-Bing Web Scale Speller Challenge of 2011. We describe the motivation and outline the algorithmic and engineering challenges posed by this activity. The design and the evaluation methods are also reviewed, ...
|
||
| Elsevier SIGIR 2011 application challenge abstract | ||
| Jukka Valimaki, Remko Caprio | ||
| Pages: 1341-1342 | ||
| doi>10.1145/2009916.2010191 | ||
|
Elsevier SIGIR 2011 Application Challenge is an international competition that encourages software developers to create applications that run on Elsevier's SciVerse platform. The Challenge is open to all SIGIR 2011 Conference participants.
|
||
Welcome to the 34th ACM SIGIR International Conference on Research and Development in Information Retrieval. The record number of papers in this year's conference represents both the breadth and depth of the research being done in this vibrant field, both in academia and industry. We have done our best to ensure that these papers meet high standards of quality in terms of presentation, citations, and experimental methodology. At the same time, we have tried to be flexible in the application of these criteria in order to accept papers describing novel and innovative work that may be somewhat unconventional.
The conference received 543 full paper submissions this year, with 240 (44%) coming from the Asia and Pacific region, 185 (34%) from the Americas, and 112 (21%) from Europe (the rest were "unknown"). Of these papers, 108 (19.9%) were accepted, up from the acceptance rate of 16.7% at last year's conference. The top five countries in terms of accepted papers were the U.S.A. (52), China (18), Germany (7), and then the U.K. and Spain (both 5). In addition, 274 short papers were submitted to the poster track, of which 89 (32.5%) were accepted. In the other categories, 15 demonstrations (42.8%), 8 workshops, and 11 half-day tutorials were accepted. In terms of the technical areas that the accepted papers cover, using the primary keyword assigned by the authors, the top five areas are document representation and content analysis (20%), retrieval models and ranking (17%), users and interactive IR (13%), queries and query analysis (11%), and filtering and recommendation (11%). Perhaps the only surprise there is the increase in the number of papers in filtering and recommendation. We believe that the papers at this year's conference provide an excellent cross-section of what is going on in our field. We hope that you find reading them and listening to the presenters to be a rewarding experience.
SIGIR uses a two-tier double blind review system. For the full papers, the first step is that at least three first-tier reviewers read every paper and provide ratings and comments. Then two additional reviewers, referred to as the primary or secondary area chairs, study those reviews, and introduce their own opinions and summaries where appropriate by making additional comments. In some cases, the area chairs initiate the discussion among the first-tier reviewers to work out any controversial issues or significant differences of opinion. A new step introduced this year was to request author feedback for specific issues in some papers. Another change this year was that final decisions for nearly all papers were made by the two area chairs together with the reviewers. At the program committee meeting in Barcelona, the program chairs and some area chairs went over the reviews, obtained additional input, and made decisions in the few cases where the area chairs had requested more discussion.
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
| Is the cranfield paradigm outdated? | ||
| Donna Harman | ||
| Pages: 1-1 | ||
| doi>10.1145/1835449.1835450 | ||
| SESSION: Clustering I | ||
| Gabriella Pasi | ||
| Prototype hierarchy based clustering for the categorization and navigation of web collections | ||
| Zhao-Yan Ming, Kai Wang, Tat-Seng Chua | ||
| Pages: 2-9 | ||
| doi>10.1145/1835449.1835453 | ||
|
This paper presents a novel prototype hierarchy based clustering (PHC) framework for the organization of web collections. It solves simultaneously the problem of categorizing web collections and interpreting the clustering results for navigation. By ...
|
||
| Person name disambiguation by bootstrapping | ||
| Minoru Yoshida, Masaki Ikeda, Shingo Ono, Issei Sato, Hiroshi Nakagawa | ||
| Pages: 10-17 | ||
| doi>10.1145/1835449.1835454 | ||
|
In this paper, we report our system that disambiguates person names in Web search results. The system uses named entities, compound key words, and URLs as features for document similarity calculation, which typically show high precision but low recall ...
|
||
| Self-taught hashing for fast similarity search | ||
| Dell Zhang, Jun Wang, Deng Cai, Jinsong Lu | ||
| Pages: 18-25 | ||
| doi>10.1145/1835449.1835455 | ||
|
The ability of fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is semantic hashing which designs compact binary codes for a large number of ...
|
||
| SESSION: User models | ||
| Ian Ruthven | ||
| Personalizing information retrieval for multi-session tasks: the roles of task stage and task type | ||
| Jingjing Liu, Nicholas J. Belkin | ||
| Pages: 26-33 | ||
| doi>10.1145/1835449.1835457 | ||
|
Dwell time as a user behavior has been found in previous studies to be an unreliable predictor of document usefulness, with contextual factors such as the user's task needing to be considered in its interpretation. Task stage has been shown to influence ...
|
||
| Predicting searcher frustration | ||
| Henry A. Feild, James Allan, Rosie Jones | ||
| Pages: 34-41 | ||
| doi>10.1145/1835449.1835458 | ||
|
When search engine users have trouble finding information, they may become frustrated, possibly resulting in a bad experience (even if they are ultimately successful). In a user study in which participants were given difficult information seeking tasks, ...
|
||
| The good, the bad, and the random: an eye-tracking study of ad quality in web search | ||
| Georg Buscher, Susan T. Dumais, Edward Cutrell | ||
| Pages: 42-49 | ||
| doi>10.1145/1835449.1835459 | ||
|
We investigate how people interact with Web search engine result pages using eye-tracking. While previous research has focused on the visual attention devoted to the 10 organic search results, this paper examines other components of contemporary search ...
|
||
| SESSION: Applications I | ||
| Luo Si | ||
| Ranking using multiple document types in desktop search | ||
| Jinyoung Kim, W. Bruce Croft | ||
| Pages: 50-57 | ||
| doi>10.1145/1835449.1835461 | ||
|
A typical desktop environment contains many document types (email, presentations, web pages, pdfs, etc.) each with different metadata. Predicting which types of documents a user is looking for in the context of a given query is a crucial part of providing ...
|
||
| Acquisition of instance attributes via labeled and related instances | ||
| Enrique Alfonseca, Marius Pasca, Enrique Robledo-Arnuncio | ||
| Pages: 58-65 | ||
| doi>10.1145/1835449.1835462 | ||
|
This paper presents a method for increasing the quality of automatically extracted instance attributes by exploiting weakly-supervised and unsupervised instance relatedness data. This data consists of (a) class labels for instances and (b) distributional ...
|
||
| Relevance and ranking in online dating systems | ||
| Fernando Diaz, Donald Metzler, Sihem Amer-Yahia | ||
| Pages: 66-73 | ||
| doi>10.1145/1835449.1835463 | ||
|
Match-making systems refer to systems where users want to meet other individuals to satisfy some underlying need. Examples of match-making systems include dating services, resume/job bulletin boards, community based question answering, and consumer-to-consumer ...
|
||
| SESSION: Search engine architectures and scalability | ||
| Alistair Moffat | ||
| Scalability of findability: effective and efficient IR operations in large information networks | ||
| Weimao Ke, Javed Mostafa | ||
| Pages: 74-81 | ||
| doi>10.1145/1835449.1835465 | ||
|
It is crucial to study basic principles that support adaptive and scalable retrieval functions in large networked environments such as the Web, where information is distributed among dynamic systems. We conducted experiments on decentralized IR operations ...
|
||
| Caching search engine results over incremental indices | ||
| Roi Blanco, Edward Bortnikov, Flavio Junqueira, Ronny Lempel, Luca Telloli, Hugo Zaragoza | ||
| Pages: 82-89 | ||
| doi>10.1145/1835449.1835466 | ||
|
A Web search engine must update its index periodically to incorporate changes to the Web. We argue in this paper that index updates fundamentally impact the design of search engine result caches, a performance-critical component of modern search engines. ...
|
||
| Query forwarding in geographically distributed search engines | ||
| B. Barla Cambazoglu, Emre Varol, Enver Kayaaslan, Cevdet Aykanat, Ricardo Baeza-Yates | ||
| Pages: 90-97 | ||
| doi>10.1145/1835449.1835467 | ||
|
Query forwarding is an important technique for preserving the result quality in distributed search engines where the index is geographically partitioned over multiple search sites. The key component in query forwarding is the thresholding algorithm by ...
|
||
| A joint probabilistic classification model for resource selection | ||
| Dzung Hong, Luo Si, Paul Bracke, Michael Witt, Tim Juchcinski | ||
| Pages: 98-105 | ||
| doi>10.1145/1835449.1835468 | ||
|
Resource selection is an important task in Federated Search to select a small number of most relevant information sources. Current resource selection algorithms such as GlOSS, CORI, ReDDE, Geometric Average and the recent classification-based method ...
|
||
| SESSION: Link analysis & advertising | ||
| Tie-Yan Liu | ||
| Temporal click model for sponsored search | ||
| Wanhong Xu, Eren Manavoglu, Erick Cantu-Paz | ||
| Pages: 106-113 | ||
| doi>10.1145/1835449.1835470 | ||
|
Previous studies on search engine click modeling have identified two presentation factors that affect users' behavior: (1) position bias: the same result will get a different number of clicks when displayed in different positions and (2) externalities: ...
|
||
| Freshness matters: in flowers, food, and web authority | ||
| Na Dai, Brian D. Davison | ||
| Pages: 114-121 | ||
| doi>10.1145/1835449.1835471 | ||
|
The collective contributions of billions of users across the globe each day result in an ever-changing web. In verticals like news and real-time search, recency is an obvious significant factor for ranking. However, traditional link-based web ranking ...
|
||
| The importance of anchor text for ad hoc search revisited | ||
| Marijn Koolen, Jaap Kamps | ||
| Pages: 122-129 | ||
| doi>10.1145/1835449.1835472 | ||
|
It is generally believed that propagated anchor text is very important for effective Web search as offered by the commercial search engines. "Google Bombs" are a notable illustration of this. However, many years of TREC Web retrieval research failed ...
|
||
| Ready to buy or just browsing?: detecting web searcher goals from interaction data | ||
| Qi Guo, Eugene Agichtein | ||
| Pages: 130-137 | ||
| doi>10.1145/1835449.1835473 | ||
|
An improved understanding of the relationship between search intent, result quality, and searcher behavior is crucial for improving the effectiveness of web search. While recent progress in user behavior mining has been largely focused on aggregate server-side ...
|
||
| SESSION: Learning to rank | ||
| Hang Li | ||
| Learning to efficiently rank | ||
| Lidan Wang, Jimmy Lin, Donald Metzler | ||
| Pages: 138-145 | ||
| doi>10.1145/1835449.1835475 | ||
|
It has been shown that learning to rank approaches are capable of learning highly effective ranking functions. However, these approaches have mostly ignored the important issue of efficiency. Given that both efficiency and effectiveness are important ...
|
||
| Ranking for the conversion funnel | ||
| Abraham Bagherjeiran, Andrew O. Hatch, Adwait Ratnaparkhi | ||
| Pages: 146-153 | ||
| doi>10.1145/1835449.1835476 | ||
|
In contextual advertising advertisers show ads to users so that they will click on them and eventually purchase a product. Optimizing this action sequence, called the conversion funnel, is the ultimate goal of advertising. Advertisers, however, often ...
|
||
| How good is a span of terms?: exploiting proximity to improve web retrieval | ||
| Krysta M. Svore, Pallika H. Kanani, Nazan Khan | ||
| Pages: 154-161 | ||
| doi>10.1145/1835449.1835477 | ||
|
Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features ...
|
||
| Learning to rank only using training data from related domain | ||
| Wei Gao, Peng Cai, Kam-Fai Wong, Aoying Zhou | ||
| Pages: 162-169 | ||
| doi>10.1145/1835449.1835478 | ||
|
Like traditional supervised and semi-supervised algorithms, learning to rank for information retrieval requires document annotations provided by domain experts. It is costly to annotate training data for different search domains and tasks. We propose ...
|
||
| SESSION: Clustering II | ||
| Omar Alonso | ||
| Optimal meta search results clustering | ||
| Claudio Carpineto, Giovanni Romano | ||
| Pages: 170-177 | ||
| doi>10.1145/1835449.1835480 | ||
|
By analogy with merging documents rankings, the outputs from multiple search results clustering algorithms can be combined into a single output. In this paper we study the feasibility of meta search results clustering, which has unique features compared ...
|
||
| Analysis of structural relationships for hierarchical cluster labeling | ||
| Markus Muhr, Roman Kern, Michael Granitzer | ||
| Pages: 178-185 | ||
| doi>10.1145/1835449.1835481 | ||
|
Cluster label quality is crucial for browsing topic hierarchies obtained via document clustering. Intuitively, the hierarchical structure should influence the labeling accuracy. However, most labeling algorithms ignore such structural properties and ...
|
||
| On the existence of obstinate results in vector space models | ||
| Milos Radovanović, Alexandros Nanopoulos, Mirjana Ivanović | ||
| Pages: 186-193 | ||
| doi>10.1145/1835449.1835482 | ||
|
The vector space model (VSM) is a popular and widely applied model in information retrieval (IR). VSM creates vector spaces whose dimensionality is usually high (e.g., tens of thousands of terms). This may cause various problems, such as susceptibility ...
|
||
| SESSION: Filtering and recommendation | ||
| Douglas W. Oard | ||
| Social media recommendation based on people and tags | ||
| Ido Guy, Naama Zwerdling, Inbal Ronen, David Carmel, Erel Uziel | ||
| Pages: 194-201 | ||
| doi>10.1145/1835449.1835484 | ||
|
We study personalized item recommendation within an enterprise social media application suite that includes blogs, bookmarks, communities, wikis, and shared files. Recommendations are based on two of the core elements of social media - people and tags. ...
|
||
| A network-based model for high-dimensional information filtering | ||
| Nikolaos Nanas, Manolis Vavalis, Anne De Roeck | ||
| Pages: 202-209 | ||
| doi>10.1145/1835449.1835485 | ||
|
The Vector Space Model has been and to a great extent still is the de facto choice for profile representation in content-based Information Filtering. However, user profiles represented as weighted keyword vectors have inherent dimensionality problems. ...
|
||
| Temporal diversity in recommender systems | ||
| Neal Lathia, Stephen Hailes, Licia Capra, Xavier Amatriain | ||
| Pages: 210-217 | ||
| doi>10.1145/1835449.1835486 | ||
|
Collaborative Filtering (CF) algorithms, used to build web-based recommender systems, are often evaluated in terms of how accurately they predict user ratings. However, current evaluation techniques disregard the fact that users continue to rate ...
|
||
| Serendipitous recommendations via innovators | ||
| Noriaki Kawamae | ||
| Pages: 218-225 | ||
| doi>10.1145/1835449.1835487 | ||
|
To realize services that provide serendipity, this paper assesses the surprise of each user when presented recommendations. We propose a recommendation algorithm that focuses on the search time that, in the absence of any recommendation, each user would ...
|
||
| SESSION: Information retrieval theory | ||
| Iadh Ounis | ||
| On statistical analysis and optimization of information retrieval effectiveness metrics | ||
| Jun Wang, Jianhan Zhu | ||
| Pages: 226-233 | ||
| doi>10.1145/1835449.1835489 | ||
|
This paper presents a new way of thinking for IR metric optimization. It is argued that the optimal ranking problem should be factorized into two distinct yet interrelated stages: the relevance prediction stage and ranking decision stage. During retrieval ...
|
||
| Information-based models for ad hoc IR | ||
| Stéphane Clinchant, Eric Gaussier | ||
| Pages: 234-241 | ||
| doi>10.1145/1835449.1835490 | ||
|
We introduce in this paper the family of information-based models for ad hoc information retrieval. These models draw their inspiration from a long-standing hypothesis in IR, namely the fact that the difference in the behaviors of a word at the ...
|
||
| Score distribution models: assumptions, intuition, and robustness to score manipulation | ||
| Evangelos Kanoulas, Keshi Dai, Virgil Pavlu, Javed A. Aslam | ||
| Pages: 242-249 | ||
| doi>10.1145/1835449.1835491 | ||
|
Inferring the score distribution of relevant and non-relevant documents is an essential task for many IR applications (e.g. information filtering, recall-oriented IR, meta-search, distributed IR). Modeling score distributions in an accurate manner is ...
|
||
| Refactoring the search problem | ||
| Gary William Flake | ||
| Pages: 250-250 | ||
| doi>10.1145/1835449.1835451 | ||
|
The most common way of framing the search problem is as an exchange between a user and a database, where the user issues queries and the database replies with results that satisfy constraints imposed by the query but that also optimize some notion of ...
|
||
| SESSION: Language models & IR theory | ||
| Geometric representations for multiple documents | ||
| Jangwon Seo, W. Bruce Croft | ||
| Pages: 251-258 | ||
| doi>10.1145/1835449.1835493 | ||
|
Combining multiple documents to represent an information object is well-known as an effective approach for many Information Retrieval tasks. For example, passages can be combined to represent a document for retrieval, document clusters are represented ...
|
||
| Using statistical decision theory and relevance models for query-performance prediction | ||
| Anna Shtok, Oren Kurland, David Carmel | ||
| Pages: 259-266 | ||
| doi>10.1145/1835449.1835494 | ||
|
We present a novel framework for the query-performance prediction task, that is, estimating the effectiveness of a search performed in response to a query in the absence of relevance judgments. Our approach is based on using statistical decision theory ...
|
||
| Active learning for ranking through expected loss optimization | ||
| Bo Long, Olivier Chapelle, Ya Zhang, Yi Chang, Zhaohui Zheng, Belle Tseng | ||
| Pages: 267-274 | ||
| doi>10.1145/1835449.1835495 | ||
|
Full text: |
||
|
Learning to rank arises in many information retrieval applications, ranging from Web search engines and online advertising to recommendation systems. In learning to rank, the performance of a ranking model is strongly affected by the number of labeled examples ...
|
||
| SESSION: Query representations & reformulations | ||
| Maarten de Rijke | ||
| Image search by concept map | ||
| Hao Xu, Jingdong Wang, Xian-Sheng Hua, Shipeng Li | ||
| Pages: 275-282 | ||
| doi>10.1145/1835449.1835497 | ||
|
Full text: |
||
|
In this paper, we present a novel image search system, image search by concept map. This system enables users to indicate not only what semantic concepts are expected to appear but also how these concepts are spatially distributed in the desired ...
|
||
| Generalized syntactic and semantic models of query reformulation | ||
| Amac Herdagdelen, Massimiliano Ciaramita, Daniel Mahler, Maria Holmqvist, Keith Hall, Stefan Riezler, Enrique Alfonseca | ||
| Pages: 283-290 | ||
| doi>10.1145/1835449.1835498 | ||
|
Full text: |
||
|
We present a novel approach to query reformulation which combines syntactic and semantic information by means of generalized Levenshtein distance algorithms where the substitution operation costs are based on probabilistic term rewrite functions. We ...
|
||
| Evaluating verbose query processing techniques | ||
| Samuel Huston, W. Bruce Croft | ||
| Pages: 291-298 | ||
| doi>10.1145/1835449.1835499 | ||
|
Full text: |
||
|
Verbose or long queries are a small but significant part of the query stream in web search, and are common in other applications such as collaborative question answering (CQA). Current search engines perform well with keyword queries but are not, in ...
|
||
| SESSION: Automatic classification | ||
| Eric Gaussier | ||
| SED: supervised experimental design and its application to text classification | ||
| Yi Zhen, Dit-Yan Yeung | ||
| Pages: 299-306 | ||
| doi>10.1145/1835449.1835501 | ||
|
Full text: |
||
|
In recent years, active learning methods based on experimental design have achieved state-of-the-art performance in text classification applications. Although these methods can exploit the distribution of unlabeled data and support batch selection, they cannot ...
|
||
| Temporally-aware algorithms for document classification | ||
| Thiago Salles, Leonardo Rocha, Gisele L. Pappa, Fernando Mourão, Wagner Meira, Jr., Marcos Gonçalves | ||
| Pages: 307-314 | ||
| doi>10.1145/1835449.1835502 | ||
|
Full text: |
||
|
Automatic Document Classification (ADC) is still one of the major information retrieval problems. It usually employs a supervised learning strategy, where we first build a classification model using pre-classified documents and then use this model to ...
|
||
| Multilabel classification with meta-level features | ||
| Siddharth Gopal, Yiming Yang | ||
| Pages: 315-322 | ||
| doi>10.1145/1835449.1835503 | ||
|
Full text: |
||
|
Effective learning in multi-label classification (MLC) requires an appropriate level of abstraction for representing the relationship between each instance and multiple categories. Current MLC methods have been focused on learning-to-map from instances ...
|
||
| SESSION: Retrieval models and ranking | ||
| Djoerd Hiemstra | ||
| Estimation of statistical translation models based on mutual information for ad hoc information retrieval | ||
| Maryam Karimzadehgan, ChengXiang Zhai | ||
| Pages: 323-330 | ||
| doi>10.1145/1835449.1835505 | ||
|
Full text: |
||
|
As a principled approach to capturing semantic relations of words in information retrieval, statistical translation models have been shown to outperform simple document language models which rely on exact matching of words in the query and documents. ...
|
||
| DivQ: diversification for keyword search over structured databases | ||
| Elena Demidova, Peter Fankhauser, Xuan Zhou, Wolfgang Nejdl | ||
| Pages: 331-338 | ||
| doi>10.1145/1835449.1835506 | ||
|
Full text: |
||
|
Keyword queries over structured databases are notoriously ambiguous. No single interpretation of a keyword query can satisfy all users, and multiple interpretations may yield overlapping results. This paper proposes a scheme to balance the relevance ...
|
||
| Finding support sentences for entities | ||
| Roi Blanco, Hugo Zaragoza | ||
| Pages: 339-346 | ||
| doi>10.1145/1835449.1835507 | ||
|
Full text: |
||
|
We study the problem of finding sentences that explain the relationship between a named entity and an ad-hoc query, which we refer to as entity support sentences. This is an important sub-problem of entity ranking which, to the best of our knowledge, ...
|
||
| Estimating probabilities for effective data fusion | ||
| David Lillis, Lusheng Zhang, Fergus Toolan, Rem W. Collier, David Leonard, John Dunnion | ||
| Pages: 347-354 | ||
| doi>10.1145/1835449.1835508 | ||
|
Full text: |
||
|
Data Fusion is the combination of a number of independent search results, relating to the same document collection, into a single result to be presented to the user. A number of probabilistic data fusion models have been shown to be effective in empirical ...
|
||
| SESSION: User feedback & user models | ||
| Nicholas J. Belkin | ||
| Incorporating post-click behaviors into a click model | ||
| Feimin Zhong, Dong Wang, Gang Wang, Weizhu Chen, Yuchen Zhang, Zheng Chen, Haixun Wang | ||
| Pages: 355-362 | ||
| doi>10.1145/1835449.1835510 | ||
|
Full text: |
||
|
Much work has attempted to model a user's click-through behavior by mining the click logs. The task is not trivial due to the well-known position bias problem. Some breakthroughs have been made: two newly proposed click models, DBN and CCM, addressed ...
|
||
| Interactive retrieval based on faceted feedback | ||
| Lanbo Zhang, Yi Zhang | ||
| Pages: 363-370 | ||
| doi>10.1145/1835449.1835511 | ||
|
Full text: |
||
|
Motivated by the commonly used faceted search interface in e-commerce, this paper investigates an interactive relevance feedback mechanism based on faceted document metadata. In this mechanism, the system recommends a group of document facet-value pairs, ...
|
||
| A comparison of general vs personalised affective models for the prediction of topical relevance | ||
| Ioannis Arapakis, Konstantinos Athanasakos, Joemon M. Jose | ||
| Pages: 371-378 | ||
| doi>10.1145/1835449.1835512 | ||
|
Full text: |
||
|
Information retrieval systems face a number of challenges, originating mainly from the semantic gap problem. Implicit feedback techniques have been employed in the past to address many of these issues. Although this was a step in the right direction, ...
|
||
| Understanding web browsing behaviors through Weibull analysis of dwell time | ||
| Chao Liu, Ryen W. White, Susan Dumais | ||
| Pages: 379-386 | ||
| doi>10.1145/1835449.1835513 | ||
|
Full text: |
||
|
Dwell time on Web pages has been extensively used for various information retrieval tasks. However, some basic yet important questions have not been sufficiently addressed, e.g., what distribution is appropriate to model the distribution of dwell ...
|
||
| SESSION: Web IR and social media search | ||
| Hugo Zaragoza | ||
| Segmentation of multi-sentence questions: towards effective question retrieval in cQA services | ||
| Kai Wang, Zhao-Yan Ming, Xia Hu, Tat-Seng Chua | ||
| Pages: 387-394 | ||
| doi>10.1145/1835449.1835515 | ||
|
Full text: |
||
|
Existing question retrieval models work relatively well in finding similar questions in community-based question answering (cQA) services. However, they are designed for single-sentence queries or bag-of-word representations, and are not sufficient to ...
|
||
| Mining the blogosphere for top news stories identification | ||
| Yeha Lee, Hun-young Jung, Woosang Song, Jong-Hyeok Lee | ||
| Pages: 395-402 | ||
| doi>10.1145/1835449.1835516 | ||
|
Full text: |
||
|
The analysis of query logs from blog search engines shows that news-related queries occupy a significant portion of the logs. This raises an interesting research question on whether the blogosphere can be used to identify important news stories. In this ...
|
||
| Proximity-based opinion retrieval | ||
| Shima Gerani, Mark James Carman, Fabio Crestani | ||
| Pages: 403-410 | ||
| doi>10.1145/1835449.1835517 | ||
|
Full text: |
||
|
Blog post opinion retrieval aims at finding blog posts that are relevant and opinionated about a user's query. In this paper we propose a simple probabilistic model for assigning relevant opinion scores to documents. The key problem is how to capture ...
|
||
| Evaluating and predicting answer quality in community QA | ||
| Chirag Shah, Jefferey Pomerantz | ||
| Pages: 411-418 | ||
| doi>10.1145/1835449.1835518 | ||
|
Full text: |
||
|
Question answering (QA) helps one go beyond traditional keyword-based querying and retrieve information in a more precise form than given by a document or a list of documents. Several community-based QA (CQA) services have emerged allowing information ...
|
||
| SESSION: Document structure & adversarial information retrieval | ||
| Mounia Lalmas | ||
| Adaptive near-duplicate detection via similarity learning | ||
| Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz | ||
| Pages: 419-426 | ||
| doi>10.1145/1835449.1835520 | ||
|
Full text: |
||
|
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued sparse k-gram vector, where the weights are learned to optimize for ...
|
||
| A content based approach for discovering missing anchor text for web search | ||
| Xing Yi, James Allan | ||
| Pages: 427-434 | ||
| doi>10.1145/1835449.1835521 | ||
|
Full text: |
||
|
Although anchor text provides very useful information for web search, a large portion of web pages have few or no incoming hyperlinks (anchors), which is known as the anchor text sparsity problem. In this paper, we propose a language modeling based technique ...
|
||
| Uncovering social spammers: social honeypots + machine learning | ||
| Kyumin Lee, James Caverlee, Steve Webb | ||
| Pages: 435-442 | ||
| doi>10.1145/1835449.1835522 | ||
|
Full text: |
||
|
Web-based social systems enable new community-based opportunities for participants to engage, share, and interact. This community value and related services like search and advertising are threatened by spammers, content polluters, and malware disseminators. ...
|
||
| SESSION: Users and interactive IR | ||
| David Carmel | ||
| Studying trailfinding algorithms for enhanced web search | ||
| Adish Singla, Ryen White, Jeff Huang | ||
| Pages: 443-450 | ||
| doi>10.1145/1835449.1835524 | ||
|
Full text: |
||
|
Search engines return ranked lists of Web pages in response to queries. These pages are starting points for post-query navigation, but may be insufficient for search tasks involving multiple steps. Search trails mined from toolbar logs start with a query ...
|
||
| Context-aware ranking in web search | ||
| Biao Xiang, Daxin Jiang, Jian Pei, Xiaohui Sun, Enhong Chen, Hang Li | ||
| Pages: 451-458 | ||
| doi>10.1145/1835449.1835525 | ||
|
Full text: |
||
|
The context of a search query often provides a search engine meaningful hints for answering the current query better. Previous studies on context-aware search were either focused on the development of context models or limited to a relatively small scale ...
|
||
| Collecting high quality overlapping labels at low cost | ||
| Hui Yang, Anton Mityagin, Krysta M. Svore, Sergey Markov | ||
| Pages: 459-466 | ||
| doi>10.1145/1835449.1835526 | ||
|
Full text: |
||
|
This paper studies the quality of human labels used to train search engines' rankers. Our specific focus is performance improvements obtained by using overlapping relevance labels, that is, by collecting multiple human judgments for each training sample. ...
|
||
| SESSION: Document representation and content analysis | ||
| Marie-Francine Moens | ||
| Multi-style language model for web scale information retrieval | ||
| Kuansan Wang, Xiaolong Li, Jianfeng Gao | ||
| Pages: 467-474 | ||
| doi>10.1145/1835449.1835528 | ||
|
Full text: |
||
|
Web documents are typically associated with many text streams, including the body, the title and the URL that are determined by the authors, and the anchor text or search queries used by others to refer to the documents. Through a systematic large scale ...
|
||
| Combining coregularization and consensus-based self-training for multilingual text categorization | ||
| Massih R. Amini, Cyril Goutte, Nicolas Usunier | ||
| Pages: 475-482 | ||
| doi>10.1145/1835449.1835529 | ||
|
Full text: |
||
|
We investigate the problem of learning document classifiers in a multilingual setting, from collections where labels are only partially available. We address this problem in the framework of multiview learning, where different languages correspond to ...
|
||
| Towards subjectifying text clustering | ||
| Sajib Dasgupta, Vincent Ng | ||
| Pages: 483-490 | ||
| doi>10.1145/1835449.1835530 | ||
|
Full text: |
||
|
Although it is common practice to produce only a single clustering of a dataset, in many cases text documents can be clustered along different dimensions. Unfortunately, not only do traditional text clustering algorithms fail to produce multiple clusterings ...
|
||
| SESSION: Summarization & user feedback | ||
| Elizabeth D. Liddy | ||
| EUSUM: extracting easy-to-understand English summaries for non-native readers | ||
| Xiaojun Wan, Huiying Li, Jianguo Xiao | ||
| Pages: 491-498 | ||
| doi>10.1145/1835449.1835532 | ||
|
Full text: |
||
|
In this paper we investigate a novel and important problem in multi-document summarization, i.e., how to extract an easy-to-understand English summary for non-native readers. Existing summarization systems extract the same kind of English summaries from ...
|
||
| Visual summarization of web pages | ||
| Binxing Jiao, Linjun Yang, Jizheng Xu, Feng Wu | ||
| Pages: 499-506 | ||
| doi>10.1145/1835449.1835533 | ||
|
Full text: |
||
|
Visual summarization is an attractive new scheme to summarize web pages, which can help achieve a more friendly user experience in search and re-finding tasks by allowing users to quickly get the idea of what the web page is about and helping users recall ...
|
||
| Learning more powerful test statistics for click-based retrieval evaluation | ||
| Yisong Yue, Yue Gao, Olivier Chapelle, Ya Zhang, Thorsten Joachims | ||
| Pages: 507-514 | ||
| doi>10.1145/1835449.1835534 | ||
|
Full text: |
||
|
Interleaving experiments are an attractive methodology for evaluating retrieval functions through implicit feedback. Designed as a blind and unbiased test for eliciting a preference between two retrieval functions, an interleaved ranking of the results ...
|
||
| SESSION: Query log analysis | ||
| Yoelle Maarek | ||
| Query similarity by projecting the query-flow graph | ||
| Ilaria Bordino, Carlos Castillo, Debora Donato, Aristides Gionis | ||
| Pages: 515-522 | ||
| doi>10.1145/1835449.1835536 | ||
|
Full text: |
||
|
Defining a measure of similarity between queries is an interesting and difficult problem. A reliable query-similarity measure can be used in a variety of applications such as query recommendation, query expansion, and advertising. In this paper, we exploit ...
|
||
| The demographics of web search | ||
| Ingmar Weber, Carlos Castillo | ||
| Pages: 523-530 | ||
| doi>10.1145/1835449.1835537 | ||
|
Full text: |
||
|
How does the web search behavior of "rich" and "poor" people differ? Do men and women tend to click on different results for the same query? What are some queries almost exclusively issued by African Americans? These are some of the questions we address ...
|
||
| A user behavior model for average precision and its generalization to graded judgments | ||
| Georges Dupret, Benjamin Piwowarski | ||
| Pages: 531-538 | ||
| doi>10.1145/1835449.1835538 | ||
|
Full text: |
||
|
We explore a set of hypotheses on user behavior that are potentially at the origin of the (Mean) Average Precision (AP) metric. This allows us to propose a more realistic version of AP where users click non-deterministically on relevant documents and ...
|
||
| SESSION: Test-collections | ||
| John Tait | ||
| The effect of assessor error on IR system evaluation | ||
| Ben Carterette, Ian Soboroff | ||
| Pages: 539-546 | ||
| doi>10.1145/1835449.1835540 | ||
|
Full text: |
||
|
Recent efforts in test collection building have focused on scaling back the number of necessary relevance judgments and then scaling up the number of search topics. Since the largest source of variation in a Cranfield-style experiment comes from the ...
|
||
| Reusable test collections through experimental design | ||
| Ben Carterette, Evangelos Kanoulas, Virgil Pavlu, Hui Fang | ||
| Pages: 547-554 | ||
| doi>10.1145/1835449.1835541 | ||
|
Full text: |
||
|
Portable, reusable test collections are a vital part of research and development in information retrieval. Reusability is difficult to assess, however. The standard approach--simulating judgment collection when groups of systems are held out, then evaluating ...
|
||
| Do user preferences and evaluation measures line up? | ||
| Mark Sanderson, Monica Lestari Paramita, Paul Clough, Evangelos Kanoulas | ||
| Pages: 555-562 | ||
| doi>10.1145/1835449.1835542 | ||
|
Full text: |
||
|
This paper presents results comparing user preference for search engine rankings with measures of effectiveness computed from a test collection. It establishes that preferences and evaluation measures correlate: systems measured as better on a test collection ...
|
||
| SESSION: Query analysis | ||
| Ricardo Baeza-Yates | ||
| Estimating advertisability of tail queries for sponsored search | ||
| Sandeep Pandey, Kunal Punera, Marcus Fontoura, Vanja Josifovski | ||
| Pages: 563-570 | ||
| doi>10.1145/1835449.1835544 | ||
|
Full text: |
||
|
Sponsored search is one of the major sources of revenue for search engines on the World Wide Web. It has been observed that while showing ads for every query maximizes short-term revenue, irrelevant ads lead to poor user experience and less revenue in ...
|
||
| Exploring reductions for long web queries | ||
| Niranjan Balasubramanian, Giridhar Kumaran, Vitor R. Carvalho | ||
| Pages: 571-578 | ||
| doi>10.1145/1835449.1835545 | ||
|
Full text: |
||
|
Long queries form a difficult, but increasingly important segment for web search engines. Query reduction, a technique for dropping unnecessary query terms from long queries, improves performance of ad-hoc retrieval on TREC collections. Also, it has ...
|
||
| Positional relevance model for pseudo-relevance feedback | ||
| Yuanhua Lv, ChengXiang Zhai | ||
| Pages: 579-586 | ||
| doi>10.1145/1835449.1835546 | ||
|
Full text: |
||
|
Pseudo-relevance feedback is an effective technique for improving retrieval results. Traditional feedback algorithms use a whole feedback document as a unit to extract words for query expansion, which is not optimal as a document may cover several different ...
|
||
| SESSION: Effectiveness measures | ||
| Ian Soboroff | ||
| Assessing the scenic route: measuring the value of search trails in web logs | ||
| Ryen W. White, Jeff Huang | ||
| Pages: 587-594 | ||
| doi>10.1145/1835449.1835548 | ||
|
Full text: |
||
|
Search trails mined from browser or toolbar logs comprise queries and the post-query pages that users visit. Implicit endorsements from many trails can be useful for search result ranking, where the presence of a page on a trail increases its query relevance. ...
|
||
| Human performance and retrieval precision revisited | ||
| Mark D. Smucker, Chandra Prakash Jethani | ||
| Pages: 595-602 | ||
| doi>10.1145/1835449.1835549 | ||
|
Full text: |
||
|
Several studies have found that the Cranfield approach to evaluation can report significant performance differences between retrieval systems for which little to no performance difference is found for humans completing tasks with these systems. We revisit ...
|
||
| Extending average precision to graded relevance judgments | ||
| Stephen E. Robertson, Evangelos Kanoulas, Emine Yilmaz | ||
| Pages: 603-610 | ||
| doi>10.1145/1835449.1835550 | ||
|
Full text: |
||
|
Evaluation metrics play a critical role both in the context of comparative evaluation of the performance of retrieval systems and in the context of learning-to-rank (LTR) as objective functions to be optimized. Many different evaluation metrics have ...
|
||
| PRES: a score metric for evaluating recall-oriented information retrieval applications | ||
| Walid Magdy, Gareth J.F. Jones | ||
| Pages: 611-618 | ||
| doi>10.1145/1835449.1835551 | ||
|
Full text: |
||
|
Information retrieval (IR) evaluation scores are generally designed to measure the effectiveness with which relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused ...
|
||
| SESSION: Multimedia information retrieval | ||
| Tat Seng Chua | ||
| Content-enriched classifier for web video classification | ||
| Bin Cui, Ce Zhang, Gao Cong | ||
| Pages: 619-626 | ||
| doi>10.1145/1835449.1835553 | ||
|
Full text: |
||
|
With the explosive growth of online videos, automatic real-time categorization of Web videos plays a key role for organizing, browsing and retrieving the huge amount of videos on the Web. Previous work shows that, in addition to text features, content ...
|
||
| Robust audio identification for MP3 popular music | ||
| Wei Li, Yaduo Liu, Xiangyang Xue | ||
| Pages: 627-634 | ||
| doi>10.1145/1835449.1835554 | ||
|
Full text: |
||
|
Audio identification via fingerprint has been an active research field with wide applications for years. Many technical papers were published and commercial software systems were also employed. However, most of these previously reported methods work ...
|
||
| Effective music tagging through advanced statistical modeling | ||
| Jialie Shen, Wang Meng, Shuichang Yan, HweeHwa Pang, Xiansheng Hua | ||
| Pages: 635-642 | ||
| doi>10.1145/1835449.1835555 | ||
|
Full text: |
||
|
Music information retrieval (MIR) holds great promise as a technology for managing large music archives. One of the key components of MIR that has been actively researched is music tagging. While significant progress has been achieved, most of the ...
|
||
| Properties of optimally weighted data fusion in CBMIR | ||
| Peter Wilkins, Alan F. Smeaton, Paul Ferguson | ||
| Pages: 643-650 | ||
| doi>10.1145/1835449.1835556 | ||
|
Full text: |
||
|
Content-Based Multimedia Information Retrieval (CBMIR) systems which leverage multiple retrieval experts (En) often employ a weighting scheme when combining expert results through data fusion. Typically, however, a query will comprise ...
|
||
| SESSION: Non-English IR & evaluation | ||
| Jaana Kekäläinen | ||
| To translate or not to translate? | ||
| Chia-Jung Lee, Chin-Hui Chen, Shao-Hang Kao, Pu-Jen Cheng | ||
| Pages: 651-658 | ||
| doi>10.1145/1835449.1835558 | ||
|
Full text: |
||
|
Query translation is an important task in cross-language information retrieval (CLIR) aiming to translate queries into languages used in documents. The purpose of this paper is to investigate the necessity of translating query terms, which might differ ...
|
||
| Multilingual PRF: English lends a helping hand | ||
| Manoj K. Chinnakotla, Karthik Raman, Pushpak Bhattacharyya | ||
| Pages: 659-666 | ||
| doi>10.1145/1835449.1835559 | ||
|
Full text: |
||
|
In this paper, we present a novel approach to Pseudo-Relevance Feedback (PRF) called Multilingual PRF (MultiPRF). The key idea is to harness multilinguality. Given a query in a language, we take the help of another language to ameliorate the well known ...
|
||
| Comparing the sensitivity of information retrieval metrics | ||
| Filip Radlinski, Nick Craswell | ||
| Pages: 667-674 | ||
| doi>10.1145/1835449.1835560 | ||
|
Full text: |
||
|
Information retrieval effectiveness is usually evaluated using measures such as Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP) and Precision at some cutoff (Precision@k) on a set of judged queries. Recent research has suggested ...
|
||
| SESSION: Applications II | ||
| David D. Lewis | ||
| Efficient partial-duplicate detection based on sequence matching | ||
| Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang | ||
| Pages: 675-682 | ||
| doi>10.1145/1835449.1835562 | ||
|
Full text: |
||
|
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining and many other web applications. Since partial-duplicates only contain a small piece of text taken from other sources ...
|
||
| Discriminative models of integrating document evidence and document-candidate associations for expert search | ||
| Yi Fang, Luo Si, Aditya P. Mathur | ||
| Pages: 683-690 | ||
| doi>10.1145/1835449.1835563 | ||
|
Full text: |
||
|
Generative models such as statistical language modeling have been widely studied in the task of expert search to model the relationship between experts and their expertise indicated in supporting documents. On the other hand, discriminative models have ...
|
||
| Vertical selection in the presence of unlabeled verticals | ||
| Jaime Arguello, Fernando Diaz, Jean-François Paiement | ||
| Pages: 691-698 | ||
| doi>10.1145/1835449.1835564 | ||
|
Full text: |
||
|
Vertical aggregation is the task of incorporating results from specialized search engines or verticals (e.g., images, video, news) into Web search results. Vertical selection is the subtask of deciding, given a query, which verticals, if any, are relevant. ...
|
||
| DEMONSTRATION SESSION: Demonstrations | ||
| iCollaborate: harvesting value from enterprise web usage | ||
| Ajinkya Kale, Thomas Burris, Bhavesh Shah, T L Prasanna Venkatesan, Lakshmanan Velusamy, Manish Gupta, Melania Degerattu | ||
| Pages: 699-699 | ||
| doi>10.1145/1835449.1835566 | ||
|
Full text: |
||
|
We are in a phase of 'Participatory Web' in which users add value to the information on the web by publishing, tagging and sharing. The Participatory Web has enormous potential for an enterprise because unlike the users of the internet an enterprise ...
|
||
| Exploring desktop resources based on user activity analysis | ||
| Yukun Li, Xiangyu Zhang, Xiaofeng Meng | ||
| Pages: 700-700 | ||
| doi>10.1145/1835449.1835567 | ||
|
Full text: |
||
|
Relocation in personal desktop resources is an interesting and promising research topic. This demonstration illustrates a new perspective in exploring desktop resources to help users re-find expected data resources more effectively. Different from existing ...
|
||
| A data-parallel toolkit for information retrieval | ||
| Dennis Fetterly, Frank McSherry | ||
| Pages: 701-701 | ||
| doi>10.1145/1835449.1835568 | ||
|
Full text: |
||
| Finding and filtering information for children | ||
| Desmond Elliot, Richard Glassey, Tamara Polajnar, Leif Azzopardi | ||
| Pages: 702-702 | ||
| doi>10.1145/1835449.1835569 | ||
|
Full text: |
||
|
Children face several challenges when using information access systems. These include formulating queries, judging the relevance of documents, and focusing attention on interface cues, such as query suggestions, while typing queries. It has also been ...
|
||
| Automatic content linking: speech-based just-in-time retrieval for multimedia archives | ||
| Andrei Popescu-Belis, Jonathan Kilgour, Peter Poller, Alexandre Nanchen, Erik Boertjes, Joost de Wit | ||
| Pages: 703-703 | ||
| doi>10.1145/1835449.1835570 | ||
|
Full text: |
||
|
The Automatic Content Linking Device monitors a conversation and uses automatically recognized words to retrieve documents that are of potential use to the participants. The document set includes project related reports or emails, transcribed snippets ...
|
||
| Si-Fi: interactive similar item finder | ||
| Inbeom Hwang, Minsuk Kahng, Sung Eun Park, Jinwook Seo, Sang-goo Lee | ||
| Pages: 704-704 | ||
| doi>10.1145/1835449.1835571 | ||
|
Full text: |
||
| Suggesting related topics in web search | ||
| Santosh Raju, Shaishav Kumar, Raghavendra Udupa | ||
| Pages: 705-705 | ||
| doi>10.1145/1835449.1835572 | ||
|
Full text: |
||
|
Suggesting topics that are related to a user's goal or interest is very important in web search. However, search engines today focus on suggesting mainly reformulations and lexical variants of the query mined from query logs. In this demonstration, we ...
|
||
| Agro-Gator: digesting experts, logs, and N-grams | ||
| Michael Huggett | ||
| Pages: 706-706 | ||
| doi>10.1145/1835449.1835573 | ||
|
Full text: |
||
|
As research includes more and larger user studies, a significant problem lies in combining the many types of data files into a single table suitable for analysis by common statistical tools. We have developed a data-aggregation tool that combines user ...
|
||
| Medical search and classification tools for recommendation | ||
| Jimmy Xiangji Huang, Aijun An, Qinmin Hu | ||
| Pages: 707-707 | ||
| doi>10.1145/1835449.1835574 | ||
|
Full text: |
||
|
their patients' records from paper to computer, enormous amounts of electronic medical records (EMR) have become available for medical research. Some of the EMR data are well-structured, for which traditional database management systems can provide effective ...
|
||
| Multilingual people search | ||
| Shaishav Kumar, Raghavendra Udupa | ||
| Pages: 708-708 | ||
| doi>10.1145/1835449.1835575 | ||
|
Full text: |
||
|
People Search is an important search service with multiple applications (e.g., looking up a friend on Facebook, finding colleagues in corporate email directories, etc.). With the proportion of non-English users on a steady rise, people search services are ...
|
||
| POSTER SESSION: Poster presentations | ||
| Closed form solution of similarity algorithms | ||
| Yuanzhe Cai, Miao Zhang, Chris Ding, Sharma Chakravarthy | ||
| Pages: 709-710 | ||
| doi>10.1145/1835449.1835577 | ||
|
Full text: |
||
|
Algorithms defining similarities between objects of an information network are important for many IR tasks. The SimRank algorithm and its variations are widely used in many applications. Many fast algorithms have also been developed. In this note, we first reformulate ...
|
||
| Blog snippets: a comments-biased approach | ||
| Javier Parapar, Jorge López-Castro, Álvaro Barreiro | ||
| Pages: 711-712 | ||
| doi>10.1145/1835449.1835578 | ||
|
Full text: |
||
|
In recent years, Blog Search has been an exciting new task in Information Retrieval. The presence of user generated information with valuable opinions makes this field of huge interest. In this poster we use part of this information, the readers' comments, ...
|
||
| SIGIR: scholar vs. scholars' interpretation | ||
| James Lanagan, Alan F. Smeaton | ||
| Pages: 713-714 | ||
| doi>10.1145/1835449.1835579 | ||
|
Full text: |
||
|
Google Scholar allows researchers to search through a free and extensive source of information on scientific publications. In this paper we show that within the limited context of SIGIR proceedings, the rankings created by Google Scholar are both significantly ...
|
||
| Effective query expansion with the resistance distance based term similarity metric | ||
| Shuguang Wang, Milos Hauskrecht | ||
| Pages: 715-716 | ||
| doi>10.1145/1835449.1835580 | ||
|
||
|
In this paper, we define a new query expansion method that relies on a term similarity metric derived from an electric resistance network. The proposed metric lets us measure the mutual relevance between terms and between groups of terms. This paper ...
|
||
| A method to automatically construct a user knowledge model in a forum environment | ||
| Ahmad Kardan, Mehdi Garakani, Bamdad Bahrani | ||
| Pages: 717-718 | ||
| doi>10.1145/1835449.1835581 | ||
|
||
|
Having a mechanism to validate opinions and to identify experts in a forum could help people favor one opinion over another. To achieve this, some solutions have already been introduced, including social network analysis techniques and reputation ...
|
||
| Learning to rank audience for behavioral targeting | ||
| Ning Liu, Jun Yan, Dou Shen, Depin Chen, Zheng Chen, Ying Li | ||
| Pages: 719-720 | ||
| doi>10.1145/1835449.1835582 | ||
|
||
|
Behavioral Targeting (BT) is a recent trend in the online advertising market. However, some classical BT solutions predefine the user segments for BT ad delivery, and these segments are sometimes too large for the numerous long-tail advertisers, who cannot afford to buy ...
|
||
| Multi-modal query expansion for web video search | ||
| Bailan Feng, Juan Cao, Zhineng Chen, Yongdong Zhang, Shouxun Lin | ||
| Pages: 721-722 | ||
| doi>10.1145/1835449.1835583 | ||
|
||
|
Query expansion is an effective method to improve the usability of multimedia search. Most existing multimedia search engines are able to automatically expand a list of textual query terms based on text search techniques, which can be called textual ...
|
||
| Context aware query classification using dynamic query window and relationship net | ||
| Nazli Goharian, Saket S.R. Mengle | ||
| Pages: 723-724 | ||
| doi>10.1145/1835449.1835584 | ||
|
||
|
The context of the user queries preceding a given query is utilized to improve the effectiveness of query classification. Earlier efforts utilize a fixed number of preceding queries to derive such context information. We propose and evaluate an approach ...
|
||
| Predicting query potential for personalization, classification or regression? | ||
| Chen Chen, Muyun Yang, Sheng Li, Tiejun Zhao, Haoliang Qi | ||
| Pages: 725-726 | ||
| doi>10.1145/1835449.1835585 | ||
|
||
|
The goal of predicting query potential for personalization is to determine which queries can benefit from personalization. In this paper, we investigate which kind of strategy is better for this task: classification or regression. We quantify the potential ...
|
||
| The impact of collection size on relevance and diversity | ||
| Marijn Koolen, Jaap Kamps | ||
| Pages: 727-728 | ||
| doi>10.1145/1835449.1835586 | ||
|
||
|
It has been observed that precision increases with collection size. One explanation could be that the redundancy of information increases, making it easier to find multiple documents conveying the same information. Arguably, a user has no interest in ...
|
||
| Spatial relationships in visual graph modeling for image categorization | ||
| Trong-Ton Pham, Philippe Mulhem, Loic Maisonnasse | ||
| Pages: 729-730 | ||
| doi>10.1145/1835449.1835587 | ||
|
||
|
In this paper, a language model adapted to graph-based representation of image content is proposed and assessed. The full indexing and retrieval processes are evaluated on two different image corpora. We show that using the spatial relationships with ...
|
||
| A picture is worth a thousand search results: finding child-oriented multimedia results with collAge | ||
| Karl Gyllstrom, Marie-Francine Moens | ||
| Pages: 731-732 | ||
| doi>10.1145/1835449.1835588 | ||
|
||
|
We present a simple and effective approach to complement search results for children's web queries with child-oriented multimedia results, such as coloring pages and music sheets. Our approach determines appropriate media types for a query by searching ...
|
||
| Query recovery of short user queries: on query expansion with stopwords | ||
| Johannes Leveling, Gareth J.F. Jones | ||
| Pages: 733-734 | ||
| doi>10.1145/1835449.1835589 | ||
|
||
|
User queries to search engines are observed to predominantly contain inflected content words but lack stopwords and capitalization. Thus, they often resemble natural language queries after case folding and stopword removal. Query recovery aims to generate ...
|
||
| Where to start filtering redundancy?: a cluster-based approach | ||
| Ronald T. Fernandez, Javier Parapar, David E. Losada, Alvaro Barreiro | ||
| Pages: 735-736 | ||
| doi>10.1145/1835449.1835590 | ||
|
||
|
Novelty detection is a difficult task, particularly at the sentence level. Most of the approaches proposed in the past consist of re-ordering all sentences according to their novelty scores. However, this re-ordering usually has little value. In fact, a naive ...
|
||
| Flickr group recommendation based on tensor decomposition | ||
| Nan Zheng, Qiudan Li, Shengcai Liao, Leiming Zhang | ||
| Pages: 737-738 | ||
| doi>10.1145/1835449.1835591 | ||
|
||
|
Over the last few years, Flickr has gained massive popularity, and groups in Flickr are one of the main ways of diffusing photos. However, the huge volume of groups makes it difficult for users to decide which group to choose. In this paper, we propose a ...
|
||
| Robust music identification based on low-order Zernike moment in the compressed domain | ||
| Wei Li, Yaduo Liu, Xiangyang Xue | ||
| Pages: 739-740 | ||
| doi>10.1145/1835449.1835592 | ||
|
||
|
In this paper, we devise a novel robust music identification algorithm that uses the compressed-domain audio Zernike moment, adapted from image processing techniques, as the pivotal feature. The audio fingerprint derived from this feature exhibits strong robustness ...
|
||
| Estimating interference in the QPRP for subtopic retrieval | ||
| Guido Zuccon, Leif Azzopardi, Claudia Hauff, C.J. Keith van Rijsbergen | ||
| Pages: 741-742 | ||
| doi>10.1145/1835449.1835593 | ||
|
||
|
The Quantum Probability Ranking Principle (QPRP) has been recently proposed, and accounts for interdependent document relevance when ranking. However, to be instantiated, the QPRP requires a method to approximate the "interference" between two documents. ...
|
||
| Query quality: user ratings and system predictions | ||
| Claudia Hauff, Franciska de Jong, Diane Kelly, Leif Azzopardi | ||
| Pages: 743-744 | ||
| doi>10.1145/1835449.1835594 | ||
|
||
|
Numerous studies have examined the ability of query performance prediction methods to estimate a query's quality for system effectiveness measures (such as average precision). However, little work has explored the relationship between these methods and ...
|
||
| Multi-field learning for email spam filtering | ||
| Wuying Liu, Ting Wang | ||
| Pages: 745-746 | ||
| doi>10.1145/1835449.1835595 | ||
|
||
|
Through the investigation of email document structure, this paper proposes a multi-field learning (MFL) framework, which breaks the multi-field document Text Classification (TC) problem into several sub-document TC problems, and makes the final category ...
|
||
| Language-model-based pro/con classification of political text | ||
| Rawia Awadallah, Maya Ramanath, Gerhard Weikum | ||
| Pages: 747-748 | ||
| doi>10.1145/1835449.1835596 | ||
|
||
|
Given a controversial political topic, our aim is to classify documents debating the topic into pro or con. Our approach extracts topic related terms, pro/con related terms, and pairs of topic related and pro/con related terms and uses them as the basis ...
|
||
| Intent boundary detection in search query logs | ||
| Chieh-Jen Wang, Kevin Hsin-Yih Lin, Hsin-Hsi Chen | ||
| Pages: 749-750 | ||
| doi>10.1145/1835449.1835597 | ||
|
||
|
Identifying intent boundaries in search query logs is important for learning users' behaviors and applying their experiences. Time-based, query-based, and cluster-based approaches are proposed. Experiments show that the integration of intent clusters and ...
|
||
| Semi-supervised spam filtering using aggressive consistency learning | ||
| Mona Mojdeh, Gordon V. Cormack | ||
| Pages: 751-752 | ||
| doi>10.1145/1835449.1835598 | ||
|
||
|
A graph based semi-supervised method for email spam filtering, based on the local and global consistency method, yields low error rates with very few labeled examples. The motivating application of this method is spam filters with access to very few ...
|
||
| Entropy descriptor for image classification | ||
| Hongyu Li, Junyu Niu, Jiachen Chen, Huibo Liu | ||
| Pages: 753-754 | ||
| doi>10.1145/1835449.1835599 | ||
|
||
|
This paper presents a novel entropy descriptor in the sense of geometric manifolds. With this descriptor, entropy cycles can be easily designed for image classification. Minimizing this entropy leads to an optimal entropy cycle where images are connected ...
|
||
| Has portfolio theory got any principles? | ||
| Guido Zuccon, Leif Azzopardi, C.J. "Keith" van Rijsbergen | ||
| Pages: 755-756 | ||
| doi>10.1145/1835449.1835600 | ||
|
||
|
Recently, Portfolio Theory (PT) has been proposed for Information Retrieval. However, under non-trivial conditions PT violates the original Probability Ranking Principle (PRP). In this poster, we shall explore whether PT upholds a different ranking principle ...
|
||
| Re-examination on lam% in spam filtering | ||
| Haoliang Qi, Muyun Yang, Xiaoning He, Sheng Li | ||
| Pages: 757-758 | ||
| doi>10.1145/1835449.1835601 | ||
|
||
|
Logistic average misclassification percentage (lam%) is a key measure of spam filtering performance. This paper demonstrates that a spam filter can achieve a perfect 0.00% in lam%, the minimal value in theory, by simply setting a biased threshold ...
|
||
| Unsupervised estimation of Dirichlet smoothing parameters | ||
| Jangwon Seo, W. Bruce Croft | ||
| Pages: 759-760 | ||
| doi>10.1145/1835449.1835602 | ||
|
||
|
A standard approach for determining a Dirichlet smoothing parameter is to choose a value which maximizes a retrieval performance metric using training data consisting of queries and relevance judgments. There are, however, situations where training data ...
|
||
| Comparing click-through data to purchase decisions for retrieval evaluation | ||
| Katja Hofmann, Bouke Huurnink, Marc Bron, Maarten de Rijke | ||
| Pages: 761-762 | ||
| doi>10.1145/1835449.1835603 | ||
|
||
|
Traditional retrieval evaluation uses explicit relevance judgments which are expensive to collect. Relevance assessments inferred from implicit feedback such as click-through data can be collected inexpensively, but may be less reliable. We compare assessments ...
|
||
| Personalize web search results with user's location | ||
| Yumao Lu, Fuchun Peng, Xing Wei, Benoit Dumoulin | ||
| Pages: 763-764 | ||
| doi>10.1145/1835449.1835604 | ||
|
||
|
We build a probabilistic model to identify implicit local intent queries, and leverage the user's physical location to improve Web search results for these queries. Evaluation on a commercial search engine shows significant improvement in search relevance ...
|
||
| Using search session context for named entity recognition in query | ||
| Junwu Du, Zhimin Zhang, Jun Yan, Yan Cui, Zheng Chen | ||
| Pages: 765-766 | ||
| doi>10.1145/1835449.1835605 | ||
|
||
|
Recently, the problem of Named Entity Recognition in Query (NERQ) has been attracting increasing attention in the field of information retrieval. However, the lack of context information in short queries makes some classical named entity recognition (NER) ...
|
||
| Evaluating whole-page relevance | ||
| Peter Bailey, Nick Craswell, Ryen W. White, Liwei Chen, Ashwin Satyanarayana, S.M.M. Tahaghoghi | ||
| Pages: 767-768 | ||
| doi>10.1145/1835449.1835606 | ||
|
||
|
Whole page relevance defines how well the surface-level representation of all elements on a search result page and the corresponding holistic attributes of the presentation respond to users' information needs. We introduce a method for evaluating the ...
|
||
| Predicting escalations of medical queries based on web page structure and content | ||
| Ryen W. White, Eric Horvitz | ||
| Pages: 769-770 | ||
| doi>10.1145/1835449.1835607 | ||
|
||
|
Logs of users' searches on Web health topics can exhibit signs of escalation of medical concerns, where initial queries about common symptoms are followed by queries about serious, rare illnesses. We present an effort to predict such escalations based ...
|
||
| Contextual video advertising system using scene information inferred from video scripts | ||
| Bong-Jun Yi, Jung-Tae Lee, Hyun-Wook Woo, Hae-Chang Rim | ||
| Pages: 771-772 | ||
| doi>10.1145/1835449.1835608 | ||
|
||
|
With the rise of digital video consumption, demand for contextual video advertising has been increasing in recent years. This paper presents a novel video advertising system that selects relevant text ads for a given video scene by automatically identifying ...
|
||
| Cross-language retrieval using link-based language models | ||
| Benjamin Roth, Dietrich Klakow | ||
| Pages: 773-774 | ||
| doi>10.1145/1835449.1835609 | ||
|
||
|
We propose a cross-language retrieval model that is solely based on Wikipedia as a training corpus. The main contributions of our work are: 1. A translation model based on linked text in Wikipedia and a term weighting method associated with it. 2. A ...
|
||
| Search system requirements of patent analysts | ||
| Leif Azzopardi, Wim Vanderbauwhede, Hideo Joho | ||
| Pages: 775-776 | ||
| doi>10.1145/1835449.1835610 | ||
|
||
|
Patent search tasks are difficult and challenging, often requiring expert patent analysts to spend hours, even days, sourcing relevant information. To aid them in this process, analysts use Information Retrieval systems and tools to cope with their retrieval ...
|
||
| On performance of topical opinion retrieval | ||
| Giambattista Amati, Giuseppe Amodeo, Valerio Capozio, Carlo Gaibisso, Giorgio Gambosi | ||
| Pages: 777-778 | ||
| doi>10.1145/1835449.1835611 | ||
|
||
|
We investigate the effectiveness of both the standard evaluation measures and the opinion component for topical opinion retrieval. We analyze how relevance is affected by opinions by perturbing relevance ranking by the outcomes of opinion-only classifiers ...
|
||
| Improving sentence retrieval with an importance prior | ||
| Leif Azzopardi, Ronald T. Fernández, David E. Losada | ||
| Pages: 779-780 | ||
| doi>10.1145/1835449.1835612 | ||
|
||
|
The retrieval of sentences is a core task within Information Retrieval. In this poster we employ a Language Model that incorporates a prior which encodes the importance of sentences within the retrieval model. Then, in a set of comprehensive experiments ...
|
||
| Focused access to sparsely and densely relevant documents | ||
| Paavo Arvola, Jaana Kekäläinen, Marko Junkkari | ||
| Pages: 781-782 | ||
| doi>10.1145/1835449.1835613 | ||
|
||
|
XML retrieval provides focused access to the relevant content of documents. However, in evaluation, full document retrieval has appeared competitive with focused XML retrieval. We analyze the density of relevance in documents, and show that in sparsely ...
|
||
| Text document clustering with metric learning | ||
| Jinlong Wang, Shunyao Wu, Huy Quan Vu, Gang Li | ||
| Pages: 783-784 | ||
| doi>10.1145/1835449.1835614 | ||
|
||
|
One reason semi-supervised clustering fails to deliver satisfactory performance in document clustering is that the transformed optimization problem could have many candidate solutions, but existing methods provide no mechanism to select a suitable ...
|
||
| Predicting query performance on the web | ||
| Niranjan Balasubramanian, Giridhar Kumaran, Vitor R. Carvalho | ||
| Pages: 785-786 | ||
| doi>10.1145/1835449.1835615 | ||
|
||
|
Predicting the performance of web queries is useful for several applications such as automatic query reformulation and automatic spell correction. In the web environment, accurate performance prediction is challenging because measures such as clarity ...
|
||
| Hashtag retrieval in a microblogging environment | ||
| Miles Efron | ||
| Pages: 787-788 | ||
| doi>10.1145/1835449.1835616 | ||
|
||
|
Microblog services let users broadcast brief textual messages to people who "follow" their activity. Often these posts contain terms called hashtags, markers of a post's meaning, audience, etc. This poster treats the following problem: given a user's ...
|
||
| Crowdsourcing a Wikipedia vandalism corpus | ||
| Martin Potthast | ||
| Pages: 789-790 | ||
| doi>10.1145/1835449.1835617 | ||
|
||
|
We report on the construction of the PAN Wikipedia vandalism corpus, PAN-WVC-10, using Amazon's Mechanical Turk. The corpus compiles 32452 edits on 28468 Wikipedia articles, among which 2391 vandalism edits have been identified. 753 human annotators ...
|
||
| MEMOSE: search engine for emotions in multimedia documents | ||
| Kathrin Knautz, Tobias Siebenlist, Wolfgang G. Stock | ||
| Pages: 791-792 | ||
| doi>10.1145/1835449.1835618 | ||
|
||
|
The MEMOSE (Media Emotion Search) system is a specialized search engine for fundamental emotions in all kinds of emotion-laden documents. We apply a controlled vocabulary for basic emotions, a slide control to adjust the intensities of the emotions ...
|
||
| Hierarchical Pitman-Yor language model for information retrieval | ||
| Saeedeh Momtazi, Dietrich Klakow | ||
| Pages: 793-794 | ||
| doi>10.1145/1835449.1835619 | ||
|
||
|
In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distribution. The Pitman-Yor process creates a power-law distribution which ...
|
||
| Entity summarization of news articles | ||
| Gianluca Demartini, Malik Muhammad Saad Missen, Roi Blanco, Hugo Zaragoza | ||
| Pages: 795-796 | ||
| doi>10.1145/1835449.1835620 | ||
|
||
|
In this paper we study the problem of entity retrieval for news applications and the importance of the news trail history (i.e. past related articles) to determine the relevant entities in current articles. We construct a novel entity-labeled corpus ...
|
||
| The power of naive query segmentation | ||
| Matthias Hagen, Martin Potthast, Benno Stein, Christof Braeutigam | ||
| Pages: 797-798 | ||
| doi>10.1145/1835449.1835621 | ||
|
||
|
We address the problem of query segmentation: given a keyword query submitted to a search engine, the task is to group the keywords into phrases, if possible. Previous approaches to the problem achieve good segmentation performance on a gold standard ...
|
||
| Clicked phrase document expansion for sponsored search ad retrieval | ||
| Dustin Hillard, Chris Leggetter | ||
| Pages: 799-800 | ||
| doi>10.1145/1835449.1835622 | ||
|
||
|
We present a document expansion approach that uses Conditional Random Field (CRF) segmentation to automatically extract salient phrases from ad titles. We then supplement the ad document with query segments that are probable translations of the document ...
|
||
| Three web-based heuristics to determine a person's or institution's country of origin | ||
| Markus Schedl, Klaus Seyerlehner, Dominik Schnitzer, Gerhard Widmer, Cornelia Schiketanz | ||
| Pages: 801-802 | ||
| doi>10.1145/1835449.1835623 | ||
|
||
|
We propose three heuristics to determine the country of origin of a person or institution via text-based IE from the Web. We evaluate all methods on a collection of music artists and bands, and show that some heuristics outperform earlier work on the ...
|
||
| Exploiting click-through data for entity retrieval | ||
| Bodo Billerbeck, Gianluca Demartini, Claudiu Firan, Tereza Iofciu, Ralf Krestel | ||
| Pages: 803-804 | ||
| doi>10.1145/1835449.1835624 | ||
|
||
|
We present an approach for answering Entity Retrieval queries using click-through information in query log data from a commercial Web search engine. We compare results using click graphs and session graphs and present an evaluation test set making use ...
|
||
| Feature subset non-negative matrix factorization and its applications to document understanding | ||
| Dingding Wang, Chris Ding, Tao Li | ||
| Pages: 805-806 | ||
| doi>10.1145/1835449.1835625 | ||
|
||
|
In this paper, we propose feature subset non-negative matrix factorization (NMF), which is an unsupervised approach to simultaneously cluster data points and select important features. We apply our proposed approach to various document understanding ...
|
||
| Learning to rank query reformulations | ||
| Van Dang, Michael Bendersky, W. Bruce Croft | ||
| Pages: 807-808 | ||
| doi>10.1145/1835449.1835626 | ||
|
||
|
Query reformulation techniques based on query logs have recently proven to be effective for web queries. However, when initial queries have reasonably good quality, these techniques are often not reliable enough to identify the helpful reformulations ...
|
||
| Many are better than one: improving multi-document summarization via weighted consensus | ||
| Dingding Wang, Tao Li | ||
| Pages: 809-810 | ||
| doi>10.1145/1835449.1835627 | ||
|
||
|
Given a collection of documents, various multi-document summarization methods have been proposed to generate a short summary. However, few studies have been reported on aggregating different summarization methods to possibly generate better summarization ...
|
||
| Exploring the use of labels to shortcut search trails | ||
| Ryen W. White, Raman Chandrasekar | ||
| Pages: 811-812 | ||
| doi>10.1145/1835449.1835628 | ||
|
||
|
Search trails comprising queries and Web page views are created as searchers engage in information-seeking activity online. During known-item search (where the objective may be to locate a target Web page), searchers may waste valuable time repeatedly ...
|
||
| Investigating the suboptimality and instability of pseudo-relevance feedback | ||
| Raghavendra Udupa, Abhijit Bhole | ||
| Pages: 813-814 | ||
| doi>10.1145/1835449.1835629 | ||
|
||
|
Although Pseudo-Relevance Feedback (PRF) techniques improve average retrieval performance at the price of high variance, not much is known about their optimality and the reasons for their instability. In this work, we study more than 800 topics from ...
|
||
| From fusion to re-ranking: a semantic approach | ||
| Annalina Caputo, Pierpaolo Basile, Giovanni Semeraro | ||
| Pages: 815-816 | ||
| doi>10.1145/1835449.1835630 | ||
|
||
|
A number of works have shown that the aggregation of several Information Retrieval (IR) systems works better than each system working individually. Nevertheless, early investigation in the context of the CLEF Robust-WSD task, in which semantics is involved, ...
|
||
| High precision opinion retrieval using sentiment-relevance flows | ||
| Seung-Wook Lee, Jung-Tae Lee, Young-In Song, Hae-Chang Rim | ||
| Pages: 817-818 | ||
| doi>10.1145/1835449.1835631 | ||
|
||
|
Opinion retrieval involves measuring the opinion score of a document about a given topic. We propose a new method, namely sentiment-relevance flow, that naturally unifies the topic relevance and the opinionated nature of a document. Experiments ...
|
||
| Ontology-enriched multi-document summarization in disaster management | ||
| Lei Li, Dingding Wang, Chao Shen, Tao Li | ||
| Pages: 819-820 | ||
| doi>10.1145/1835449.1835632 | ||
|
||
|
In this poster, we propose a novel document summarization approach named Ontology-enriched Multi-Document Summarization (OMS) for utilizing background knowledge to improve summarization results. OMS first maps the sentences of input documents onto an ...
|
||
| Multi-view clustering of multilingual documents | ||
| Young-Min Kim, Massih-Reza Amini, Cyril Goutte, Patrick Gallinari | ||
| Pages: 821-822 | ||
| doi>10.1145/1835449.1835633 | ||
|
||
|
We propose a new multi-view clustering method which uses clustering results obtained on each view as a voting pattern in order to construct a new set of multi-view clusters. Our experiments on a multilingual corpus of documents show that performance ...
|
||
| A stack decoder approach to approximate string matching | ||
| Juan M. Huerta | ||
| Pages: 823-824 | ||
| doi>10.1145/1835449.1835634 | ||
|
||
|
We present a new efficient algorithm for top-N match retrieval of sequential patterns. Our approach is based on an incremental approximation of the string edit distance using index information and a stack based search. Our approach produces hypotheses ...
|
||
| Late fusion of compact composite descriptors for retrieval from heterogeneous image databases | ||
| Savvas A. Chatzichristofis, Avi Arampatzis | ||
| Pages: 825-826 | ||
| doi>10.1145/1835449.1835635 | ||
|
||
|
Compact composite descriptors (CCDs) are global image features, capturing more than one type of information at the same time in a very compact representation. Their quality has so far been evaluated in retrieval from several homogeneous databases containing ...
|
||
| Inferring user intent in web search by exploiting social annotations | ||
| Jose M. Conde, David Vallet, Pablo Castells | ||
| Pages: 827-828 | ||
| doi>10.1145/1835449.1835636 | ||
|
||
|
In this paper, we present a folksonomy-based approach for implicit user intent extraction during a Web search process. We present a number of result re-ranking techniques based on this representation that can be applied to any Web search engine. We perform ...
|
||
| Query term ranking based on dependency parsing of verbose queries | ||
| Jae Hyun Park, W. Bruce Croft | ||
| Pages: 829-830 | ||
| doi>10.1145/1835449.1835637 | ||
|
||
|
Query term ranking approaches are used to select effective terms from a verbose query by ranking terms. Features used for query term ranking and selection in previous work do not consider grammatical relationships between terms. To address this issue, ...
|
||
| A ranking approach to target detection for automatic link generation | ||
| Jiyin He, Maarten de Rijke | ||
| Pages: 831-832 | ||
| doi>10.1145/1835449.1835638 | ||
|
||
|
We focus on the task of target detection in automatic link generation with Wikipedia, i.e., given an N-gram in a snippet of text, find the relevant Wikipedia concepts that explain or provide background knowledge for it. We formulate the task as a ranking ...
|
||
| Probabilistic latent maximal marginal relevance | ||
| Shengbo Guo, Scott Sanner | ||
| Pages: 833-834 | ||
| doi>10.1145/1835449.1835639 | ||
|
||
|
Diversity has been heavily motivated in the information retrieval literature as an objective criterion for result sets in search and recommender systems. Perhaps one of the most well-known and most used algorithms for result set diversification is that ...
|
||
| Using local precision to compare search engines in consumer health information retrieval | ||
| Carla Teixeira Lopes, Cristina Ribeiro | ||
| Pages: 835-836 | ||
| doi>10.1145/1835449.1835640 | ||
|
||
|
We have conducted a user study to evaluate several generalist and health-specific search engines on health information retrieval. Users evaluated the relevance of the top 30 documents of 4 search engines for two different health information needs. We ...
|
||
| multi Searcher: can we support people to get information from text they can't read or understand? | ||
| Farag Ahmed, Andreas Nürnberger | ||
| Pages: 837-838 | ||
| doi>10.1145/1835449.1835641 | ||
|
||
|
The goal of the proposed tool multi Searcher is to answer this research question: can we expect people to be able to get information from text in languages they cannot read or understand? The proposed tool multi Searcher provides users with interactive ...
|
||
| Linking Wikipedia to the web | ||
| Rianne Kaptein, Pavel Serdyukov, Jaap Kamps | ||
| Pages: 839-840 | ||
| doi>10.1145/1835449.1835642 | ||
|
||
|
We investigate the task of finding links from Wikipedia pages to external web pages. Such external links significantly extend the information in Wikipedia with information from the Web at large, while retaining the encyclopedic organization of Wikipedia. ...
|
||
| Short text classification in Twitter to improve information filtering | ||
| Bharath Sriram, Dave Fuhry, Engin Demir, Hakan Ferhatosmanoglu, Murat Demirbas | ||
| Pages: 841-842 | ||
| doi>10.1145/1835449.1835643 | ||
|
||
|
In microblogging services such as Twitter, the users may become overwhelmed by the raw data. One solution to this problem is the classification of short text messages. As short texts do not provide sufficient word occurrences, traditional classification ...
|
||
| A framework for BM25F-based XML retrieval | ||
| Kelly Y. Itakura, Charles L.A. Clarke | ||
| Pages: 843-844 | ||
| doi>10.1145/1835449.1835644 | ||
|
||
|
We evaluate a framework for BM25F-based XML element retrieval. The framework gathers contextual information associated with each XML element into an associated field, which we call a characteristic field. The contents of the element and the contents ...
|
||
| Can search systems detect users' task difficulty?: some behavioral signals | ||
| Jingjing Liu, Chang Liu, Jacek Gwizdka, Nicholas J. Belkin | ||
| Pages: 845-846 | ||
| doi>10.1145/1835449.1835645 | ||
|
||
|
In this paper, we report findings on how user behaviors vary in tasks with different difficulty levels as well as of different types. Two behavioral signals, document dwell time and number of content pages viewed per query, were found to be able to help ...
|
||
| Query log analysis in the context of information retrieval for children | ||
| Sergio Duarte Torres, Djoerd Hiemstra, Pavel Serdyukov | ||
| Pages: 847-848 | ||
| doi>10.1145/1835449.1835646 | ||
|
||
|
In this paper we analyze queries and sessions intended to satisfy children's information needs using a large-scale query log. The aim of this analysis is twofold: i) To identify differences between such queries and sessions, and general queries and sessions; ...
|
||
| Transitive history-based query disambiguation for query reformulation | ||
| Karim Filali, Anish Nair, Chris Leggetter | ||
| Pages: 849-850 | ||
| doi>10.1145/1835449.1835647 | ||
|
||
|
We present a probabilistic model of a user's search history and a target query reformulation. We derive a simple transitive similarity algorithm for disambiguating queries and improving history-based query reformulation accuracy. We compare the merits ...
|
||
| Using Flickr geotags to predict user travel behaviour | ||
| Maarten Clements, Pavel Serdyukov, Arjen P. de Vries, Marcel J.T. Reinders | ||
| Pages: 851-852 | ||
| doi>10.1145/1835449.1835648 | ||
|
||
|
We propose a method to predict a user's favourite locations in a city, based on his Flickr geotags in other cities. We define a similarity between the geotag distributions of two users based on a Gaussian kernel convolution. The geotags of the most similar ...
|
||
| Metrics for assessing sets of subtopics | ||
| Filip Radlinski, Martin Szummer, Nick Craswell | ||
| Pages: 853-854 | ||
| doi>10.1145/1835449.1835649 | ||
|
||
|
To evaluate the diversity of search results, test collections have been developed that identify multiple intents for each query. Intents are the different meanings or facets that should be covered in a search results list. This means that topic development ...
|
||
| Learning to select rankers | ||
| Niranjan Balasubramanian, James Allan | ||
| Pages: 855-856 | ||
| doi>10.1145/1835449.1835650 | ||
|
||
|
Combining evidence from multiple retrieval models has been widely studied in the context of distributed search, metasearch and rank fusion. Much of the prior work has focused on combining retrieval scores (or the rankings) assigned by different retrieval ...
|
||
| VisualSum: an interactive multi-document summarization system using visualization | ||
| Yi Zhang, Dingding Wang, Tao Li | ||
| Pages: 857-858 | ||
| doi>10.1145/1835449.1835651 | ||
|
||
|
Given a collection of documents, most existing multi-document summarization methods automatically generate a static summary for all the users. However, different users may have different opinions on the documents, thus there is a necessity for improving ...
|
||
| Web page publication time detection and its application for page rank | ||
| Zhumin Chen, Jun Ma, Chaoran Cui, Hongxing Rui, Shaomang Huang | ||
| Pages: 859-860 | ||
| doi>10.1145/1835449.1835652 | ||
|
||
|
Publication Time (P-time for short) of Web pages is often required in many application areas. In this paper, we address the issue of P-time detection and its application for page rank. We first propose an approach to extract P-time for a page with explicit ...
|
||
| HCC: a hierarchical co-clustering algorithm | ||
| Jingxuan Li, Tao Li | ||
| Pages: 861-862 | ||
| doi>10.1145/1835449.1835653 | ||
|
||
|
In this poster, we develop a novel method, called HCC, for hierarchical co-clustering. HCC brings together two interrelated but distinct themes from clustering: hierarchical clustering and co-clustering. The goal of the former theme is to organize clusters ...
|
||
| Retrieval system evaluation: automatic evaluation versus incomplete judgments | ||
| Claudia Hauff, Franciska de Jong | ||
| Pages: 863-864 | ||
| doi>10.1145/1835449.1835654 | ||
|
||
|
In information retrieval (IR), research aiming to reduce the cost of retrieval system evaluations has been conducted along two lines: (i) the evaluation of IR systems with reduced amounts of manual relevance assessments, and (ii) the fully automatic ...
|
||
| Aspect presence verification conditional on other aspects | ||
| Dmitri Roussinov | ||
| Pages: 865-866 | ||
| doi>10.1145/1835449.1835655 | ||
|
||
|
I have shown that the presence of difficult query aspects that are revealed only implicitly (e.g. exploration, opposition, achievements, cooperation, risks) can be improved by taking advantage of the known presence of other, easier to verify query aspects. ...
|
||
| The value of visual elements in web search | ||
| Marilyn Ostergren, Seung-yon Yu, Efthimis N. Efthimiadis | ||
| Pages: 867-868 | ||
| doi>10.1145/1835449.1835656 | ||
|
||
|
We used eye-tracking equipment to observe 36 participants as they performed three search tasks using three graphically-enhanced web search interfaces (Kartoo, SearchMe and Viewzi). In this poster we describe findings of the study focusing on how the ...
|
||
| Diversification of search results using webgraphs | ||
| Praveen Chandar, Ben Carterette | ||
| Pages: 869-870 | ||
| doi>10.1145/1835449.1835657 | ||
|
||
|
A set of words is often insufficient to express a user's information need. In order to account for various information needs associated with a query, diversification seems to be a reasonable strategy. By diversifying the result set, we increase the probability ...
|
||
| Capturing page freshness for web search | ||
| Na Dai, Brian D. Davison | ||
| Pages: 871-872 | ||
| doi>10.1145/1835449.1835658 | ||
|
||
|
Freshness has been increasingly realized by commercial search engines as an important criterion for measuring the quality of search results. However, most information retrieval methods focus on the relevance of page content to given queries without considering ...
|
||
| S-PLASA+: adaptive sentiment analysis with application to sales performance prediction | ||
| Yang Liu, Xiaohui Yu, Xiangji Huang, Aijun An | ||
| Pages: 873-874 | ||
| doi>10.1145/1835449.1835659 | ||
|
||
|
Analyzing the large volume of online reviews would produce useful knowledge that could be of economic values to vendors and other interested parties. In particular, the sentiments expressed in the online reviews have been shown to be strongly correlated ...
|
||
| Supervised query modeling using wikipedia | ||
| Edgar Meij, Maarten de Rijke | ||
| Pages: 875-876 | ||
| doi>10.1145/1835449.1835660 | ||
|
||
|
We use Wikipedia articles to semantically inform the generation of query models. To this end, we apply supervised machine learning to automatically link queries to Wikipedia articles and sample terms from the linked articles to re-estimate the query ...
|
||
| A two-stage model for blog feed search | ||
| Wouter Weerkamp, Krisztian Balog, Maarten de Rijke | ||
| Pages: 877-878 | ||
| doi>10.1145/1835449.1835661 | ||
|
||
|
We consider blog feed search: identifying relevant blogs for a given topic. An individual's search behavior often involves a combination of exploratory behavior triggered by salient features of the information objects being examined plus goal-directed ...
|
||
| Machine learned ranking of entity facets | ||
| Roelof van Zwol, Lluís Garcia Pueyo, Mridul Muralidharan, Börkur Sigurbjörnsson | ||
| Pages: 879-880 | ||
| doi>10.1145/1835449.1835662 | ||
|
||
|
The research described in this paper forms the backbone of a service that enables the faceted search experience of the Yahoo! search engine. We introduce an approach for a machine learned ranking of entity facets based on user click feedback and features ...
|
||
| User comments for news recommendation in social media | ||
| Jia Wang, Qing Li, Yuanzhu Peter Chen | ||
| Pages: 881-882 | ||
| doi>10.1145/1835449.1835663 | ||
|
||
|
Reading and commenting on online news is becoming a common user behavior in social media. Discussion in the form of comments following news postings can be effectively facilitated if the service provider can recommend articles based on not only the original ...
|
||
| Incorporating global information into named entity recognition systems using relational context | ||
| Yuval Merhav, Filipe Mesquita, Denilson Barbosa, Wai Gen Yee, Ophir Frieder | ||
| Pages: 883-884 | ||
| doi>10.1145/1835449.1835664 | ||
|
||
|
The state-of-the-art in Named Entity Recognition relies on a combination of local features of the text and global knowledge to determine the types of the recognized entities. This is problematic in some cases, resulting in entities being classified as ...
|
||
| Achieving high accuracy retrieval using intra-document term ranking | ||
| Hyun-Wook Woo, Jung-Tae Lee, Seung-Wook Lee, Young-In Song, Hae-Chang Rim | ||
| Pages: 885-886 | ||
| doi>10.1145/1835449.1835665 | ||
|
||
|
Most traditional ranking models roughly score the relevance of a given document by observing simple term statistics, such as the occurrence of query terms within the document or within the collection. Intuitively, the relative importance of query terms ...
|
||
| Author interest topic model | ||
| Noriaki Kawamae | ||
| Pages: 887-888 | ||
| doi>10.1145/1835449.1835666 | ||
|
||
|
This paper presents a hierarchical topic model that simultaneously captures topics and authors' interests. Our proposal, the Author Interest Topic model (AIT), introduces a latent variable with a separate probability distribution over topics into each ...
|
||
| On the relationship between effectiveness and accessibility | ||
| Leif Azzopardi, Richard Bache | ||
| Pages: 889-890 | ||
| doi>10.1145/1835449.1835667 | ||
|
||
|
Typically the evaluation of Information Retrieval (IR) systems is focused upon two main system attributes: efficiency and effectiveness. However, it has been argued that it is also important to consider accessibility, i.e. the extent to which the IR ...
|
||
| Visual concept-based selection of query expansions for spoken content retrieval | ||
| Stevan Rudinac, Martha Larson, Alan Hanjalic | ||
| Pages: 891-892 | ||
| doi>10.1145/1835449.1835668 | ||
|
||
|
In this paper we present a novel approach to semantic-theme-based video retrieval that considers entire videos as retrieval units and exploits automatically detected visual concepts to improve the results of retrieval based on spoken content. We deploy ...
|
||
| Mining adjacent markets from a large-scale ads video collection for image advertising | ||
| Guwen Feng, Xin-Jing Wang, Lei Zhang, Wei-Ying Ma | ||
| Pages: 893-894 | ||
| doi>10.1145/1835449.1835669 | ||
|
||
|
The research on image advertising is still in its infancy. Most previous approaches suggest ads by directly matching an ad to a query image, which lacks the power to identify ads from adjacent markets. In this paper, we tackle the problem by mining knowledge ...
|
||
| A co-learning framework for learning user search intents from rule-generated training data | ||
| Jun Yan, Zeyu Zheng, Li Jiang, Yan Li, Shuicheng Yan, Zheng Chen | ||
| Pages: 895-896 | ||
| doi>10.1145/1835449.1835670 | ||
|
||
|
Learning to understand user search intents from their online behaviors is crucial for both Web search and online advertising. However, it is a challenging task to collect and label a sufficient amount of high quality training data for various user intents ...
|
||
| Learning the click-through rate for rare/new ads from similar ads | ||
| Kushal S. Dave, Vasudeva Varma | ||
| Pages: 897-898 | ||
| doi>10.1145/1835449.1835671 | ||
|
||
|
Ads on the search engine (SE) are generally ranked based on their Click-through rates (CTR). Hence, accurately predicting the CTR of an ad is of paramount importance for maximizing the SE's revenue. We present a model that inherits the click information ...
|
||
| Graphical models for text: a new paradigm for text representation and processing | ||
| Charu Aggarwal, Peixiang Zhao | ||
| Pages: 899-900 | ||
| doi>10.1145/1835449.1835672 | ||
|
||
|
Almost all text applications use the well known vector-space model for text representation and analysis. While the vector-space model has proven itself to be an effective and efficient representation for mining purposes, it does not preserve information ...
|
||
| A survival modeling approach to biomedical search result diversification using wikipedia | ||
| Xiaoshi Yin, Jimmy Xiangji Huang, Xiaofeng Zhou, Zhoujun Li | ||
| Pages: 901-902 | ||
| doi>10.1145/1835449.1835673 | ||
|
||
|
In this paper, we propose a probabilistic survival model derived from the survival analysis theory for measuring aspect novelty. The retrieved documents' query-relevance and novelty are combined at the aspect level for re-ranking. Experiments conducted ...
|
||
| TUTORIAL SESSION: Tutorials | ||
| Low cost evaluation in information retrieval | ||
| Ben Carterette, Evangelos Kanoulas, Emine Yilmaz | ||
| Pages: 903-903 | ||
| doi>10.1145/1835449.1835675 | ||
|
||
|
Search corpora are growing larger and larger: over the last 10 years, the IR research community has moved from the several hundred thousand documents on the TREC disks to the tens of millions of U.S. government web pages of GOV2 to the one billion general-interest ...
|
||
| Learning to rank for information retrieval | ||
| Tie-Yan Liu | ||
| Pages: 904-904 | ||
| doi>10.1145/1835449.1835676 | ||
|
||
|
This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, ...
|
||
| Introduction to probabilistic models in IR | ||
| Victor P. Lavrenko | ||
| Pages: 905-905 | ||
| doi>10.1145/1835449.1835677 | ||
|
||
|
Most of today's state-of-the-art retrieval models, including BM25 and language modeling, are grounded in probabilistic principles. Having a working understanding of these principles can help researchers understand existing retrieval models better and ...
|
||
| Multimedia information retrieval | ||
| Stefan Rueger | ||
| Pages: 906-906 | ||
| doi>10.1145/1835449.1835678 | ||
|
||
|
This tutorial is concerned with creating the best possible multimedia search experience. The intriguing bit here is that the query itself can be a multimedia excerpt: For example, when you walk around in an unknown place and stumble across an interesting ...
|
||
| Web retrieval: the role of users | ||
| Ricardo Baeza-Yates, Yoelle Maarek | ||
| Pages: 907-907 | ||
| doi>10.1145/1835449.1835679 | ||
|
||
|
Web retrieval methods have evolved through three major steps in the last decade or so. They started from standard document-centric IR in the early days of the Web, then made a major step forward by leveraging the structure of the Web, using link analysis ...
|
||
| Information retrieval challenges in computational advertising | ||
| Andrei Broder, Evgeniy Gabrilovich, Vanja Josifovski | ||
| Pages: 908-908 | ||
| doi>10.1145/1835449.1835680 | ||
|
||
|
Computational advertising is an emerging scientific sub-discipline, at the intersection of large scale search and text analysis, information retrieval, statistical modeling, machine learning, classification, optimization, and microeconomics. The central ...
|
||
| Extraction of open-domain class attributes from text: building blocks for faceted search | ||
| Marius Pasca | ||
| Pages: 909-909 | ||
| doi>10.1145/1835449.1835681 | ||
|
||
|
Knowledge automatically extracted from text captures instances, classes of instances and relations among them. In particular, the acquisition of class attributes (e.g., "top speed", "body style" and "number of cylinders" for the class of "sports cars") ...
|
||
| From federated to aggregated search | ||
| Fernando Diaz, Mounia Lalmas, Milad Shokouhi | ||
| Pages: 910-910 | ||
| doi>10.1145/1835449.1835682 | ||
|
||
|
Federated search refers to the brokered retrieval of content from a set of auxiliary retrieval systems instead of from a single, centralized retrieval system. Federated search tasks occur in, for example, digital libraries (where documents from several ...
|
||
| Estimating the query difficulty for information retrieval | ||
| David Carmel, Elad Yom-Tov | ||
| Pages: 911-911 | ||
| doi>10.1145/1835449.1835683 | ||
|
||
|
Many information retrieval (IR) systems suffer from a radical variance in performance when responding to users' queries. Even for systems that succeed very well on average, the quality of results returned for some of the queries is poor. Thus, it is ...
|
||
| Search and browse log mining for web information retrieval: challenges, methods, and applications | ||
| Daxin Jiang, Jian Pei, Hang Li | ||
| Pages: 912-912 | ||
| doi>10.1145/1835449.1835684 | ||
|
||
|
Huge amounts of search log data have been accumulated in various search engines. Currently, a commercial search engine receives billions of queries and collects terabytes of log data on any single day. Other than search log data, browse logs can be ...
|
||
| Information retrieval for e-discovery | ||
| David D. Lewis | ||
| Pages: 913-913 | ||
| doi>10.1145/1835449.1835685 | ||
|
||
|
Discovery, the process under which parties to legal cases must reveal documents relevant to the disputed issues, is a core aspect of trials in the United States, and a lesser but important factor in other countries. Discovery on documents stored in computerized ...
|
||
| SESSION: Doctoral consortium | ||
| On the mono- and cross-language detection of text reuse and plagiarism | ||
| Alberto Barrón-Cedeño | ||
| Pages: 914-914 | ||
| doi>10.1145/1835449.1835687 | ||
|
||
|
Plagiarism, the unacknowledged reuse of text, has increased in recent years due to the large amount of texts readily available. For instance, recent studies claim that nowadays a high rate of student reports include plagiarism, making manual plagiarism ...
|
||
| User interface designs to support the social transfer of web search expertise | ||
| Neema Moraveji | ||
| Pages: 915-915 | ||
| doi>10.1145/1835449.1835688 | ||
|
||
|
While there are many ways to develop search expertise, I maintain that most members of the general public do so in an inefficient manner. One reason is that, with current tools, it is difficult to observe experts as a means of acquiring search expertise ...
|
||
| Leveraging user interaction and collaboration for improving multilingual information access in digital libraries | ||
| Juliane Stiller | ||
| Pages: 916-916 | ||
| doi>10.1145/1835449.1835689 | ||
|
||
|
The goal of interactive cross-lingual information retrieval systems is to support users in formulating effective queries and selecting the documents which satisfy their information needs regardless of the language of these documents. This dissertation ...
|
||
| Entity information management in complex networks | ||
| Yi Fang | ||
| Pages: 917-917 | ||
| doi>10.1145/1835449.1835690 | ||
|
||
|
Entity information management (EIM) deals with organizing, processing and delivering information about entities. Its emergence is a result of satisfying more sophisticated information needs that go beyond document search. In recent years, entity ...
|
||
| Finding people and their utterances in social media | ||
| Wouter Weerkamp | ||
| Pages: 918-918 | ||
| doi>10.1145/1835449.1835691 | ||
|
||
|
Since its introduction, social media, "a group of internet-based applications that (...) allow the creation and exchange of user generated content" [1], has attracted more and more users. Over the years, many platforms have arisen that allow users to ...
|
||
| Leveraging user-generated content for news search | ||
| Richard M.C. McCreadie | ||
| Pages: 919-919 | ||
| doi>10.1145/1835449.1835692 | ||
|
||
|
Over the last few years both availability and accessibility of current news stories on the Web have dramatically improved. In particular, users can now access news from a variety of sources hosted on the Web, from newswire presences such as the New York ...
|
||
| User centered story tracking | ||
| Ilija Subasic | ||
| Pages: 920-920 | ||
| doi>10.1145/1835449.1835693 | ||
|
||
|
Using data collections available on the Internet has for many people become the main medium for staying informed about the world. Many of these collections are dynamic in nature, evolving as the subjects they describe change. The goal of different research ...
|
||
| Reverse annotation based retrieval from large document image collections | ||
| Pramod Sankar K. | ||
| Pages: 921-921 | ||
| doi>10.1145/1835449.1835694 | ||
|
||
|
A number of projects are dedicated to creating digital libraries from scanned books, such as Google Books, UDL, Digital Library of India (DLI), etc. The ability to search in the content of document images is essential for the usability and popularity ...
|
||
| Learning hidden variable models for blog retrieval | ||
| Mengqiu Wang | ||
| Pages: 922-922 | ||
| doi>10.1145/1835449.1835695 | ||
|
||
|
We describe probabilistic models that leverage individual blog post evidence to improve blog seed retrieval performances. Our model offers an intuitive and principled method to combine multiple posts in scoring a whole blog site by treating individual ...
|
||
| Investigation on smoothing and aggregation methods in blog retrieval | ||
| Mostafa Keikha | ||
| Pages: 923-923 | ||
| doi>10.1145/1835449.1835696 | ||
|
||
|
Recently, user-generated data has been growing rapidly and becoming one of the most important sources of information on the web. The blogosphere (the collection of blogs on the web) is one of the main sources of information in this category. In my work for my PhD, ...
|
||
| Aiming for user experience in information retrieval: towards user-centered relevance (UCR) | ||
| Frans van der Sluis, Betsy van Dijk, Egon L. van den Broek | ||
| Pages: 924-924 | ||
| doi>10.1145/1835449.1835697 | ||
|
||
Welcome to the 33rd ACM SIGIR International Conference on Research and Development in Information Retrieval. SIGIR 2010 has attracted a record-breaking number of papers, signalling once again the importance of information retrieval research. We continue to see a steady growth of research output as well as a growing diversity of subjects in our field, where emerging topics, such as learning to rank, social media search, query log analysis, recommender systems or advertising and search, are now reaching a relative maturity. This year we observe a continued interest in foundational aspects of IR, such as IR theory and evaluation studies, and also a growing interest in traditional topics, such as clustering and classification. If we want to summarize this SIGIR conference with a single word or phrase, we can suggest "users" or "users and queries", indicating the importance of users in search. But we will let you discover this for yourself while delving through these conference proceedings.
There were 520 full paper submissions representing the work of IR researchers in more than 39 countries. Of these, 87 (16.7%) were accepted, representing the different geographic areas as follows: 42 from the Americas, 25 from Europe -- Africa and 20 from Asia -- Pacific. In addition to the full papers, a further 5 were offered the opportunity of presentation as posters. There were 90 (30.7%) posters, 10 (50%) demonstrations, 11 (52%) tutorials and 9 (50%) workshops accepted for inclusion in the technical program. A doctoral consortium with 11 PhD candidates is also part of the technical program. It is worth noting that more than half of the submitted papers (293, or 56%) have a student as the first author, an encouraging sign for the growth and vitality of the IR community. We are grateful to the keynote speakers Donna Harman from NIST, and Gary Flake from Microsoft, who agreed to share their ideas with the community.